Koordinator Core Scheduling Algorithm
Pods are scheduled based on a node's actual CPU and memory utilization, rather than on the Pods' resource requests.
Load-Aware Scheduling
Scenarios
- When a node's resource utilization reaches a high watermark, severe resource contention occurs among the workloads running on it;
- Workloads in a cluster have different resource profiles: some are CPU-intensive while others are memory-intensive;
- Abnormal nodes should be avoided during scheduling to prevent unexpected failures;
Goals
- Provide a configurable scheduling plugin to help control cluster resource utilization;
- Support multiple resource types in the utilization control mechanism;
- Keep resource utilization within a safe threshold;
Design
Scheduling flow:
- A controller submits the Pods to be created to the API server.
- koordlet collects Pod information from the kubelet on each node.
- koordlet reports the resource usage of the node and its Pods as a NodeMetric object.
- Via the list/watch mechanism, the scheduler picks up Pods whose spec.nodeName field is empty and schedules them.
- The scheduler receives NodeMetric updates from the API server.
- The scheduler runs its extension plugins (Filter, Score, Reserve, etc.) and then binds the Pod to the selected node.
The scheduling plugin extends the Filter/Score/Reserve/Unreserve extension points defined by the Kubernetes scheduling framework.
The core of the scoring algorithm is to pick the node with the lowest resource usage. However, to compensate for the delay in usage reporting and in Pod startup,
the resource requests of Pods scheduled within the recent time window, as well as of the Pod currently being scheduled, are also estimated, and those estimates participate in the calculation.
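The interaction between reported usage, the estimates for just-scheduled Pods, and least-requested scoring can be illustrated with a simplified sketch. This is not the plugin source: plain int64 milli-values replace the corev1 types, and `estimatedUsed` plus the sample numbers are illustrative assumptions:

```go
package main

import "fmt"

const maxNodeScore = 100

// leastRequestedScore gives higher scores to nodes with more free capacity,
// mirroring the least-requested formula used by the plugin.
func leastRequestedScore(requested, capacity int64) int64 {
    if capacity == 0 || requested > capacity {
        return 0
    }
    return (capacity - requested) * maxNodeScore / capacity
}

// estimatedUsed adds the estimated usage of Pods scheduled within the time
// window (not yet visible in the reported metrics) to the reported usage.
func estimatedUsed(reported int64, pendingEstimates []int64) int64 {
    total := reported
    for _, e := range pendingEstimates {
        total += e
    }
    return total
}

func main() {
    // A node with 4000m allocatable CPU reports 1200m in use; two just-scheduled
    // Pods are estimated at 400m each, so the node is scored as if 2000m were used.
    used := estimatedUsed(1200, []int64{400, 400})
    fmt.Println(leastRequestedScore(used, 4000)) // 50
}
```

Without the estimates, the node would score 70 instead of 50 and could be flooded with Pods before its reported metrics catch up.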
Source Code
The arguments for load-aware scheduling are shown below:
// LoadAwareSchedulingArgs holds arguments used to configure the LoadAwareScheduling plugin.
type LoadAwareSchedulingArgs struct {
    metav1.TypeMeta

    // FilterExpiredNodeMetrics indicates whether to filter nodes where koordlet fails to update NodeMetric.
    // Deprecated: NodeMetric should always be checked for expiration.
    FilterExpiredNodeMetrics *bool
    // NodeMetricExpirationSeconds indicates the NodeMetric expiration in seconds.
    // When NodeMetrics expired, the node is considered abnormal.
    // Default is 180 seconds.
    NodeMetricExpirationSeconds *int64
    // ResourceWeights indicates the weights of resources.
    // The weights of CPU and Memory are both 1 by default.
    ResourceWeights map[corev1.ResourceName]int64
    // UsageThresholds indicates the resource utilization threshold of the whole machine.
    // The default for CPU is 65%, and the default for memory is 95%.
    UsageThresholds map[corev1.ResourceName]int64
    // ProdUsageThresholds indicates the resource utilization threshold of Prod Pods compared to the whole machine.
    // Not enabled by default.
    ProdUsageThresholds map[corev1.ResourceName]int64
    // ScoreAccordingProdUsage controls whether to score according to the utilization of Prod Pods.
    ScoreAccordingProdUsage bool
    // Estimator indicates the expected Estimator to use.
    Estimator string
    // EstimatedScalingFactors indicates the factor when estimating resource usage.
    // The default value of CPU is 85%, and the default value of Memory is 70%.
    EstimatedScalingFactors map[corev1.ResourceName]int64
    // Aggregated supports resource utilization filtering and scoring based on percentile statistics.
    Aggregated *LoadAwareSchedulingAggregatedArgs
}
// ScoringStrategyType is a "string" type.
type ScoringStrategyType string

const (
    // MostAllocated strategy favors nodes with the least amount of available resources.
    MostAllocated ScoringStrategyType = "MostAllocated"
    // BalancedAllocation strategy favors nodes with balanced resource usage rates.
    BalancedAllocation ScoringStrategyType = "BalancedAllocation"
    // LeastAllocated strategy favors nodes with the most amount of available resources.
    LeastAllocated ScoringStrategyType = "LeastAllocated"
)
- FilterExpiredNodeMetrics specifies whether to filter nodes where koordlet fails to update NodeMetric;
- NodeMetricExpirationSeconds indicates the NodeMetric expiration time in seconds; when a NodeMetric expires, the node is considered abnormal. Defaults to 180 seconds;
- ResourceWeights indicates the weights of resources; both CPU and memory default to 1;
- UsageThresholds indicates the resource utilization thresholds; defaults are 65% for CPU and 95% for memory;
- EstimatedScalingFactors indicates the factors used when estimating resource usage; defaults are 85% for CPU and 70% for memory.
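As a hedged sketch of how a scaling factor could enter the estimate (the actual logic lives in the plugin's estimator package; `estimatePodResource` is a hypothetical helper for illustration): a Pod's estimated usage is its resource request scaled down by the factor, reflecting that Pods rarely consume their full request:

```go
package main

import "fmt"

// estimatePodResource scales a resource request (in milli-units) by a
// percentage scaling factor to obtain the estimated usage.
func estimatePodResource(requestMilli, scalingFactorPercent int64) int64 {
    return requestMilli * scalingFactorPercent / 100
}

func main() {
    // A Pod requesting 2000m CPU, with the default CPU factor of 85%,
    // is estimated to use 1700m.
    fmt.Println(estimatePodResource(2000, 85)) // 1700
}
```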
Load-aware scheduling is initialized as follows:
type Plugin struct {
    handle framework.Handle
    args   *config.LoadAwareSchedulingArgs
    // Pod metadata
    podLister corev1listers.PodLister
    // NodeMetric metadata
    nodeMetricLister slolisters.NodeMetricLister
    estimator        estimator.Estimator
    // cache of Pods assigned to each node
    podAssignCache *podAssignCache
}
func New(args runtime.Object, handle framework.Handle) (framework.Plugin, error) {
    pluginArgs, ok := args.(*config.LoadAwareSchedulingArgs)
    if !ok {
        return nil, fmt.Errorf("want args to be of type LoadAwareSchedulingArgs, got %T", args)
    }
    if err := validation.ValidateLoadAwareSchedulingArgs(pluginArgs); err != nil {
        return nil, err
    }
    frameworkExtender, ok := handle.(frameworkext.ExtendedHandle)
    if !ok {
        return nil, fmt.Errorf("want handle to be of type frameworkext.ExtendedHandle, got %T", handle)
    }
    // cache of Pods assigned to each node
    assignCache := newPodAssignCache()
    // Pod list-watch
    podInformer := frameworkExtender.SharedInformerFactory().Core().V1().Pods()
    frameworkexthelper.ForceSyncFromInformer(context.TODO().Done(), frameworkExtender.SharedInformerFactory(), podInformer.Informer(), assignCache)
    podLister := podInformer.Lister()
    // NodeMetric metadata
    nodeMetricLister := frameworkExtender.KoordinatorSharedInformerFactory().Slo().V1alpha1().NodeMetrics().Lister()
    estimator, err := estimator.NewEstimator(pluginArgs, handle)
    if err != nil {
        return nil, err
    }
    return &Plugin{
        handle:           handle,
        args:             pluginArgs,
        podLister:        podLister,
        nodeMetricLister: nodeMetricLister,
        estimator:        estimator,
        podAssignCache:   assignCache,
    }, nil
}
Built on the scheduler framework, the plugin implements the FilterPlugin, ScorePlugin, and ReservePlugin extension points (as well as EnqueueExtensions), as shown below:
var (
    _ framework.EnqueueExtensions = &Plugin{}
    _ framework.FilterPlugin      = &Plugin{}
    _ framework.ScorePlugin       = &Plugin{}
    _ framework.ReservePlugin     = &Plugin{}
)
The Filter extension is shown below:
func (p *Plugin) Filter(ctx context.Context, state *framework.CycleState, pod *corev1.Pod, nodeInfo *framework.NodeInfo) *framework.Status {
    node := nodeInfo.Node()
    if node == nil {
        return framework.NewStatus(framework.Error, "node not found")
    }
    // DaemonSet Pods are exempt from filtering
    if isDaemonSetPod(pod.OwnerReferences) {
        return nil
    }
    // fetch the node's metrics
    nodeMetric, err := p.nodeMetricLister.Get(node.Name)
    if err != nil {
        // For nodes that lack load information, fall back to the situation where there is no load-aware scheduling.
        // Some nodes in the cluster do not install the koordlet, but users newly created Pod use koord-scheduler to schedule,
        // and the load-aware scheduling itself is an optimization, so we should skip these nodes.
        if errors.IsNotFound(err) {
            return nil
        }
        return framework.NewStatus(framework.Error, err.Error())
    }
    // if the NodeMetric has expired, skip load-aware filtering for this node
    if p.args.FilterExpiredNodeMetrics != nil && *p.args.FilterExpiredNodeMetrics &&
        p.args.NodeMetricExpirationSeconds != nil && isNodeMetricExpired(nodeMetric, *p.args.NodeMetricExpirationSeconds) {
        return nil
    }
    filterProfile := generateUsageThresholdsFilterProfile(node, p.args)
    if len(filterProfile.ProdUsageThresholds) > 0 && extension.GetPodPriorityClassWithDefault(pod) == extension.PriorityProd {
        // filter on the resource utilization of Prod Pods
        status := p.filterProdUsage(node, nodeMetric, filterProfile.ProdUsageThresholds)
        if !status.IsSuccess() {
            return status
        }
    } else {
        var usageThresholds map[corev1.ResourceName]int64
        if filterProfile.AggregatedUsage != nil {
            usageThresholds = filterProfile.AggregatedUsage.UsageThresholds
        } else {
            usageThresholds = filterProfile.UsageThresholds
        }
        if len(usageThresholds) > 0 {
            // filter on the node's overall resource utilization
            status := p.filterNodeUsage(node, nodeMetric, filterProfile)
            if !status.IsSuccess() {
                return status
            }
        }
    }
    return nil
}
func (p *Plugin) filterProdUsage(node *corev1.Node, nodeMetric *slov1alpha1.NodeMetric, prodUsageThresholds map[corev1.ResourceName]int64) *framework.Status {
    if len(nodeMetric.Status.PodsMetric) == 0 {
        return nil
    }
    // TODO(joseph): maybe we should estimate the Pod that just be scheduled that have not reported
    podMetrics := buildPodMetricMap(p.podLister, nodeMetric, true)
    prodPodUsages, _ := sumPodUsages(podMetrics, nil)
    // utilization of each resource (CPU, memory, etc.) by Prod Pods
    for resourceName, threshold := range prodUsageThresholds {
        if threshold == 0 {
            continue
        }
        // estimate the node's allocatable resources
        allocatable, err := p.estimator.EstimateNode(node)
        if err != nil {
            klog.ErrorS(err, "Failed to EstimateNode", "node", node.Name)
            return nil
        }
        total := allocatable[resourceName]
        if total.IsZero() {
            continue
        }
        used := prodPodUsages[resourceName]
        // resource utilization as a percentage
        usage := int64(math.Round(float64(used.MilliValue()) / float64(total.MilliValue()) * 100))
        if usage >= threshold {
            // the node is unschedulable once the threshold is reached
            return framework.NewStatus(framework.Unschedulable, fmt.Sprintf(ErrReasonUsageExceedThreshold, resourceName))
        }
    }
    return nil
}
func (p *Plugin) filterNodeUsage(node *corev1.Node, nodeMetric *slov1alpha1.NodeMetric, filterProfile *usageThresholdsFilterProfile) *framework.Status {
    if nodeMetric.Status.NodeMetric == nil {
        return nil
    }
    var usageThresholds map[corev1.ResourceName]int64
    if filterProfile.AggregatedUsage != nil {
        usageThresholds = filterProfile.AggregatedUsage.UsageThresholds
    } else {
        usageThresholds = filterProfile.UsageThresholds
    }
    for resourceName, threshold := range usageThresholds {
        if threshold == 0 {
            continue
        }
        allocatable, err := p.estimator.EstimateNode(node)
        if err != nil {
            klog.ErrorS(err, "Failed to EstimateNode", "node", node.Name)
            return nil
        }
        total := allocatable[resourceName]
        if total.IsZero() {
            continue
        }
        // TODO(joseph): maybe we should estimate the Pod that just be scheduled that have not reported
        var nodeUsage *slov1alpha1.ResourceMap
        if filterProfile.AggregatedUsage != nil {
            nodeUsage = getTargetAggregatedUsage(
                nodeMetric,
                filterProfile.AggregatedUsage.UsageAggregatedDuration,
                filterProfile.AggregatedUsage.UsageAggregationType,
            )
        } else {
            nodeUsage = &nodeMetric.Status.NodeMetric.NodeUsage
        }
        if nodeUsage == nil {
            continue
        }
        used := nodeUsage.ResourceList[resourceName]
        usage := int64(math.Round(float64(used.MilliValue()) / float64(total.MilliValue()) * 100))
        // filter out nodes whose utilization of the resource (CPU, memory, etc.) exceeds the threshold
        if usage >= threshold {
            reason := ErrReasonUsageExceedThreshold
            if filterProfile.AggregatedUsage != nil {
                reason = ErrReasonAggregatedUsageExceedThreshold
            }
            return framework.NewStatus(framework.Unschedulable, fmt.Sprintf(reason, resourceName))
        }
    }
    return nil
}
func buildPodMetricMap(podLister corev1listers.PodLister, nodeMetric *slov1alpha1.NodeMetric, filterProdPod bool) map[string]corev1.ResourceList {
    if len(nodeMetric.Status.PodsMetric) == 0 {
        return nil
    }
    podMetrics := make(map[string]corev1.ResourceList)
    // read per-Pod usage from nodeMetric.Status.PodsMetric;
    // the NodeMetric data is collected by the koordlet DaemonSet
    for _, podMetric := range nodeMetric.Status.PodsMetric {
        pod, err := podLister.Pods(podMetric.Namespace).Get(podMetric.Name)
        if err != nil {
            continue
        }
        if filterProdPod && extension.GetPodPriorityClassWithDefault(pod) != extension.PriorityProd {
            continue
        }
        name := getPodNamespacedName(podMetric.Namespace, podMetric.Name)
        podMetrics[name] = podMetric.PodUsage.ResourceList
    }
    return podMetrics
}
func sumPodUsages(podMetrics map[string]corev1.ResourceList, estimatedPods sets.String) (podUsages, estimatedPodsUsages corev1.ResourceList) {
    if len(podMetrics) == 0 {
        return nil, nil
    }
    podUsages = make(corev1.ResourceList)
    estimatedPodsUsages = make(corev1.ResourceList)
    for podName, usage := range podMetrics {
        if estimatedPods.Has(podName) {
            util.AddResourceList(estimatedPodsUsages, usage)
            continue
        }
        util.AddResourceList(podUsages, usage)
    }
    return podUsages, estimatedPodsUsages
}
The Filter plugin's main job is to filter out nodes and Pods against the resource thresholds:
- whether the node's CPU and memory utilization exceeds its thresholds;
- whether the aggregate CPU and memory utilization of Prod Pods on the node exceeds its thresholds;
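The check itself reduces to a percentage comparison. A minimal sketch mirroring the rounding in filterNodeUsage, with plain int64 milli-values standing in for resource.Quantity:

```go
package main

import (
    "fmt"
    "math"
)

// exceedsThreshold reports whether used/total, rounded to a whole percentage,
// meets or exceeds the threshold. A zero threshold or zero capacity never filters.
func exceedsThreshold(usedMilli, totalMilli, thresholdPercent int64) bool {
    if thresholdPercent == 0 || totalMilli == 0 {
        return false
    }
    usage := int64(math.Round(float64(usedMilli) / float64(totalMilli) * 100))
    return usage >= thresholdPercent
}

func main() {
    // 2700m used of 4000m allocatable is 67.5% ≈ 68%, which exceeds the 65% CPU default.
    fmt.Println(exceedsThreshold(2700, 4000, 65)) // true
}
```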
The Score extension is shown below:
func (p *Plugin) Score(ctx context.Context, state *framework.CycleState, pod *corev1.Pod, nodeName string) (int64, *framework.Status) {
    // fetch the node from the scheduler snapshot
    nodeInfo, err := p.handle.SnapshotSharedLister().NodeInfos().Get(nodeName)
    if err != nil {
        return 0, framework.NewStatus(framework.Error, fmt.Sprintf("getting node %q from Snapshot: %v", nodeName, err))
    }
    node := nodeInfo.Node()
    if node == nil {
        return 0, framework.NewStatus(framework.Error, "node not found")
    }
    // fetch the node's metrics
    nodeMetric, err := p.nodeMetricLister.Get(nodeName)
    if err != nil {
        // caused by load-aware scheduling itself is an optimization,
        // so we should skip the node and score the node 0
        if errors.IsNotFound(err) {
            return 0, nil
        }
        return 0, framework.NewStatus(framework.Error, err.Error())
    }
    // if the NodeMetric has expired, the node does not participate in scoring (scores 0)
    if p.args.NodeMetricExpirationSeconds != nil && isNodeMetricExpired(nodeMetric, *p.args.NodeMetricExpirationSeconds) {
        return 0, nil
    }
    // whether to score according to the utilization of Prod Pods
    prodPod := extension.GetPodPriorityClassWithDefault(pod) == extension.PriorityProd && p.args.ScoreAccordingProdUsage
    podMetrics := buildPodMetricMap(p.podLister, nodeMetric, prodPod)
    // estimate the resource usage of the Pod being scheduled
    estimatedUsed, err := p.estimator.EstimatePod(pod)
    if err != nil {
        return 0, nil
    }
    assignedPodEstimatedUsed, estimatedPods := p.estimatedAssignedPodUsed(nodeName, nodeMetric, podMetrics, prodPod)
    for resourceName, value := range assignedPodEstimatedUsed {
        estimatedUsed[resourceName] += value
    }
    // actual resource usage of the Pods already on the node
    podActualUsages, estimatedPodActualUsages := sumPodUsages(podMetrics, estimatedPods)
    if prodPod {
        for resourceName, quantity := range podActualUsages {
            estimatedUsed[resourceName] += getResourceValue(resourceName, quantity)
        }
    } else {
        if nodeMetric.Status.NodeMetric != nil {
            var nodeUsage *slov1alpha1.ResourceMap
            if scoreWithAggregation(p.args.Aggregated) {
                nodeUsage = getTargetAggregatedUsage(nodeMetric, &p.args.Aggregated.ScoreAggregatedDuration, p.args.Aggregated.ScoreAggregationType)
            } else {
                nodeUsage = &nodeMetric.Status.NodeMetric.NodeUsage
            }
            if nodeUsage != nil {
                for resourceName, quantity := range nodeUsage.ResourceList {
                    if q := estimatedPodActualUsages[resourceName]; !q.IsZero() {
                        quantity = quantity.DeepCopy()
                        if quantity.Cmp(q) >= 0 {
                            quantity.Sub(q)
                        }
                    }
                    estimatedUsed[resourceName] += getResourceValue(resourceName, quantity)
                }
            }
        }
    }
    // the node's allocatable resources
    allocatable, err := p.estimator.EstimateNode(node)
    if err != nil {
        return 0, nil
    }
    // compute the score; p.args.ResourceWeights defaults to defaultResourceWeights
    // set in SetDefaults_LoadAwareSchedulingArgs, i.e. CPU and memory with weight 1 each
    score := loadAwareSchedulingScorer(p.args.ResourceWeights, estimatedUsed, allocatable)
    return score, nil
}
// SetDefaults_LoadAwareSchedulingArgs sets the default parameters for the LoadAwareScheduling plugin.
func SetDefaults_LoadAwareSchedulingArgs(obj *LoadAwareSchedulingArgs) {
    if obj.FilterExpiredNodeMetrics == nil {
        obj.FilterExpiredNodeMetrics = pointer.Bool(true)
    }
    if obj.NodeMetricExpirationSeconds == nil {
        obj.NodeMetricExpirationSeconds = pointer.Int64(defaultNodeMetricExpirationSeconds)
    }
    if len(obj.ResourceWeights) == 0 {
        obj.ResourceWeights = defaultResourceWeights
    }
    if len(obj.UsageThresholds) == 0 {
        obj.UsageThresholds = defaultUsageThresholds
    }
    if obj.EstimatedScalingFactors == nil {
        obj.EstimatedScalingFactors = defaultEstimatedScalingFactors
    } else {
        for k, v := range defaultEstimatedScalingFactors {
            if _, ok := obj.EstimatedScalingFactors[k]; !ok {
                obj.EstimatedScalingFactors[k] = v
            }
        }
    }
}
var (
    defaultNodeMetricExpirationSeconds int64 = 180

    defaultResourceWeights = map[corev1.ResourceName]int64{
        corev1.ResourceCPU:    1,
        corev1.ResourceMemory: 1,
    }

    defaultUsageThresholds = map[corev1.ResourceName]int64{
        corev1.ResourceCPU:    65, // 65%
        corev1.ResourceMemory: 95, // 95%
    }

    defaultEstimatedScalingFactors = map[corev1.ResourceName]int64{
        corev1.ResourceCPU:    85, // 85%
        corev1.ResourceMemory: 70, // 70%
    }
)
// averages the per-resource scores using the configured weights
func loadAwareSchedulingScorer(resToWeightMap, used map[corev1.ResourceName]int64, allocatable corev1.ResourceList) int64 {
    var nodeScore, weightSum int64
    for resourceName, weight := range resToWeightMap {
        resourceScore := leastRequestedScore(used[resourceName], getResourceValue(resourceName, allocatable[resourceName]))
        nodeScore += resourceScore * weight
        weightSum += weight
    }
    return nodeScore / weightSum
}

// least-requested scoring: the more free capacity remains, the higher the score
func leastRequestedScore(requested, capacity int64) int64 {
    if capacity == 0 {
        return 0
    }
    if requested > capacity {
        return 0
    }
    return ((capacity - requested) * framework.MaxNodeScore) / capacity
}
The Score plugin's main job is to score nodes based on their CPU and memory utilization.
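As a worked example of the weighted scoring above, simplified to plain int64 maps with string keys (not the plugin source):

```go
package main

import "fmt"

const maxNodeScore = 100

// leastRequestedScore mirrors the least-requested formula above.
func leastRequestedScore(requested, capacity int64) int64 {
    if capacity == 0 || requested > capacity {
        return 0
    }
    return (capacity - requested) * maxNodeScore / capacity
}

// weightedScore averages the per-resource least-requested scores by weight,
// mirroring loadAwareSchedulingScorer.
func weightedScore(weights, used, allocatable map[string]int64) int64 {
    var score, weightSum int64
    for res, w := range weights {
        score += leastRequestedScore(used[res], allocatable[res]) * w
        weightSum += w
    }
    if weightSum == 0 {
        return 0
    }
    return score / weightSum
}

func main() {
    weights := map[string]int64{"cpu": 1, "memory": 1}
    used := map[string]int64{"cpu": 1000, "memory": 2 << 30}  // 1000m CPU, 2 GiB
    alloc := map[string]int64{"cpu": 4000, "memory": 8 << 30} // 4000m CPU, 8 GiB
    // CPU scores 75 and memory scores 75, so the node scores 75 overall.
    fmt.Println(weightedScore(weights, used, alloc)) // 75
}
```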
The Reserve and Unreserve extensions are shown below:
func (p *Plugin) Reserve(ctx context.Context, state *framework.CycleState, pod *corev1.Pod, nodeName string) *framework.Status {
    p.podAssignCache.assign(nodeName, pod)
    return nil
}

func (p *podAssignCache) assign(nodeName string, pod *corev1.Pod) {
    if nodeName == "" || util.IsPodTerminated(pod) {
        return
    }
    p.lock.Lock()
    defer p.lock.Unlock()
    m := p.podInfoItems[nodeName]
    if m == nil {
        m = make(map[types.UID]*podAssignInfo)
        p.podInfoItems[nodeName] = m
    }
    m[pod.UID] = &podAssignInfo{
        timestamp: timeNowFn(),
        pod:       pod,
    }
}

func IsPodTerminated(pod *corev1.Pod) bool {
    return pod.Status.Phase == corev1.PodSucceeded || pod.Status.Phase == corev1.PodFailed
}

func (p *Plugin) Unreserve(ctx context.Context, state *framework.CycleState, pod *corev1.Pod, nodeName string) {
    p.podAssignCache.unAssign(nodeName, pod)
}

func (p *podAssignCache) unAssign(nodeName string, pod *corev1.Pod) {
    if nodeName == "" {
        return
    }
    p.lock.Lock()
    defer p.lock.Unlock()
    delete(p.podInfoItems[nodeName], pod.UID)
    if len(p.podInfoItems[nodeName]) == 0 {
        delete(p.podInfoItems, nodeName)
    }
}
Configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: koord-scheduler-config
  ...
data:
  koord-scheduler-config: |
    apiVersion: kubescheduler.config.k8s.io/v1beta2
    kind: KubeSchedulerConfiguration
    profiles:
      - schedulerName: koord-scheduler
        plugins:
          # enable the LoadAwareScheduling plugin
          filter:
            enabled:
              - name: LoadAwareScheduling
            ...
          score:
            enabled:
              - name: LoadAwareScheduling
                weight: 1
            ...
          reserve:
            enabled:
              - name: LoadAwareScheduling
          ...
        pluginConfig:
          # configure the plugin's thresholds and weights
          - name: LoadAwareScheduling
            args:
              apiVersion: kubescheduler.config.k8s.io/v1beta2
              kind: LoadAwareSchedulingArgs
              # whether to filter nodes where koordlet fails to update NodeMetric
              filterExpiredNodeMetrics: true
              # the expiration threshold for NodeMetric, in seconds
              nodeMetricExpirationSeconds: 300
              # resource weights
              resourceWeights:
                cpu: 1
                memory: 1
              # resource utilization thresholds (%)
              usageThresholds:
                cpu: 75
                memory: 85
              # resource utilization thresholds for Prod Pods (%)
              prodUsageThresholds:
                cpu: 55
                memory: 65
              # score according to the usage of Prod Pods
              scoreAccordingProdUsage: true
              # factors for estimating resource usage (%)
              estimatedScalingFactors:
                cpu: 80
                memory: 70
              # utilization filtering and scoring based on percentile statistics
              aggregated:
                usageThresholds:
                  cpu: 65
                  memory: 75
                usageAggregationType: "p99"
                scoreAggregationType: "p99"
| Field | Description |
|---|---|
| filterExpiredNodeMetrics | Whether to filter nodes where koordlet fails to update NodeMetric. Enabled by default, but disabled in the Helm chart. |
| nodeMetricExpirationSeconds | The NodeMetric expiration time in seconds; when a NodeMetric expires, the node is considered abnormal. Defaults to 180 seconds. |
| resourceWeights | The weights of resources. Both CPU and memory default to 1. |
| usageThresholds | The resource utilization thresholds of the whole node. Defaults are 65% for CPU and 95% for memory. |
| estimatedScalingFactors | The factors used when estimating resource usage. Defaults are 85% for CPU and 70% for memory. |
| prodUsageThresholds | The resource utilization thresholds of Prod Pods relative to the whole node. Not enabled by default. |
| scoreAccordingProdUsage | Whether to score according to the utilization of Prod Pods. |
| aggregated | Supports utilization filtering and scoring based on percentile statistics. |
Fields supported by aggregated:
| Field | Description |
|---|---|
| usageThresholds | The node's resource utilization thresholds based on percentile statistics. |
| usageAggregationType | The percentile type of the node's utilization used when filtering. Currently avg, p50, p90, p95 and p99 are supported. |
| usageAggregatedDuration | The statistical period of the utilization percentile used when filtering. When unset, the scheduler uses the data with the longest period in NodeMetrics by default. |
| scoreAggregationType | The percentile type of the node's utilization used when scoring. Currently avg, p50, p90, p95 and p99 are supported. |
| scoreAggregatedDuration | The statistical period of the Prod Pod utilization percentile used when scoring. When unset, the scheduler uses the data with the longest period in NodeMetrics by default. |
The plugin configuration serves as the cluster-wide default; users can also set node-level load thresholds by attaching an annotation to individual nodes.
When the annotation is present on a node, filtering uses the parameters it specifies. The annotation is defined as follows:
const (
    AnnotationCustomUsageThresholds = "scheduling.koordinator.sh/usage-thresholds"
)

// CustomUsageThresholds supports user-defined node resource utilization thresholds.
type CustomUsageThresholds struct {
    // UsageThresholds indicates the resource utilization thresholds of the whole machine.
    UsageThresholds map[corev1.ResourceName]int64 `json:"usageThresholds,omitempty"`
    // ProdUsageThresholds indicates the resource utilization thresholds of Prod Pods compared to the whole machine.
    ProdUsageThresholds map[corev1.ResourceName]int64 `json:"prodUsageThresholds,omitempty"`
    // AggregatedUsage supports utilization filtering and scoring based on percentile statistics.
    AggregatedUsage *CustomAggregatedUsage `json:"aggregatedUsage,omitempty"`
}

type CustomAggregatedUsage struct {
    // UsageThresholds indicates the machine's resource utilization thresholds based on percentile statistics.
    UsageThresholds map[corev1.ResourceName]int64 `json:"usageThresholds,omitempty"`
    // UsageAggregationType indicates the percentile type of the machine's utilization when filtering.
    UsageAggregationType slov1alpha1.AggregationType `json:"usageAggregationType,omitempty"`
    // UsageAggregatedDuration indicates the statistical period of the machine's utilization percentile when filtering.
    UsageAggregatedDuration *metav1.Duration `json:"usageAggregatedDuration,omitempty"`
}