kube-state-metrics
Common Alerts
The 49 alerts below, grouped into five categories, cover almost all common problem scenarios:
Cluster Core Component Alerts (10)
AlertmanagerClusterCrashlooping, AlertmanagerClusterDown
AlertmanagerClusterFailedToSendAlerts, AlertmanagerConfigInconsistent
AlertmanagerFailedReload, KubeAPIDown
KubeAPIErrorBudgetBurn, PrometheusBadConfig
PrometheusTargetSyncFailure, PrometheusRuleFailures
Workload Related Alerts (12)
KubeDeploymentReplicasMismatch, KubeStatefulSetReplicasMismatch
KubePodCrashLooping, KubePodNotReady
KubeJobFailed, KubeHpaReplicasMismatch
KubeHpaMaxedOut, KubeHpaHighUtilization
KubeHpaFrequentScaling, KubeImagePullBackOff
KubeServiceEndpointsUnavailable, KubeConfigReloadFailed
Container and Resource Alerts (12)
KubeContainerOOMKilled
KubeContainerCPUNearLimit
KubeContainerMemoryNearLimit
KubeTooManyPendingPods
KubeCPUOvercommit
KubeMemoryOvercommit
KubeQuotaExceeded
KubeQuotaAlmostFull
CPUThrottlingHigh
NodeCPUHighUsage
NodeMemoryHighUtilization
NodeSystemSaturation
Node and Storage Alerts (10)
KubeletDown, KubeNodeNotReady
NodeFileDescriptorLimit, KubePersistentVolumeErrors
KubePersistentVolumeFillingUp, KubePersistentVolumeInodesFillingUp
KubeVolumeMountFailed, NodeFilesystemAlmostOutOfSpace
NodeFilesystemSpaceFillingUp, KubeStateMetricsDown
Certificate and Monitoring Alerts (5)
KubeClientCertificateExpiration
KubeletClientCertificateExpiration
KubeletServerCertificateExpiration
KubeStateMetricsListErrors
Watchdog
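
One way to put the categories above to work is to group whatever is currently firing by category. The sketch below is illustrative only. It assumes the input is the decoded JSON body of Prometheus' `GET /api/v1/alerts` endpoint. The names `CATEGORY_BY_ALERT`, `firing_by_category`, and `SAMPLE` are hypothetical, the category map is deliberately partial, and the sample payload is made up.

```python
from collections import defaultdict

# Hypothetical, partial map from alert name to the categories listed above.
# Extend it with the remaining alert names as needed.
CATEGORY_BY_ALERT = {
    "KubeAPIDown": "Cluster Core Component",
    "KubePodCrashLooping": "Workload",
    "CPUThrottlingHigh": "Container and Resource",
    "KubeNodeNotReady": "Node and Storage",
    "Watchdog": "Certificate and Monitoring",
}


def firing_by_category(payload, category_by_alert=CATEGORY_BY_ALERT):
    """Group the names of firing alerts by category.

    `payload` is assumed to be the decoded JSON body returned by
    Prometheus' GET /api/v1/alerts endpoint.
    """
    grouped = defaultdict(list)
    for alert in payload.get("data", {}).get("alerts", []):
        # Skip alerts that are still pending.
        if alert.get("state") != "firing":
            continue
        name = alert.get("labels", {}).get("alertname", "")
        grouped[category_by_alert.get(name, "Uncategorized")].append(name)
    return {category: sorted(names) for category, names in grouped.items()}


# Made-up sample payload, for illustration only.
SAMPLE = {
    "status": "success",
    "data": {
        "alerts": [
            {"labels": {"alertname": "Watchdog"}, "state": "firing"},
            {"labels": {"alertname": "KubePodCrashLooping"}, "state": "firing"},
            {"labels": {"alertname": "KubeAPIDown"}, "state": "pending"},
        ]
    },
}

if __name__ == "__main__":
    print(firing_by_category(SAMPLE))
```

Alerts missing from the map fall into an "Uncategorized" bucket, so newly added rules still show up in the summary instead of being dropped.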