Prometheus Operator - 摩杜云 Developer Community

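Compared with installing and configuring Prometheus step by step, there is a simpler way to set up monitoring and alerting: the Prometheus Operator. The Prometheus Operator provides simple definitions for monitoring Kubernetes resources and for managing Prometheus instances, which simplifies deploying, managing, and running Prometheus and Alertmanager clusters on Kubernetes.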

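The Prometheus Operator provides Kubernetes-native deployment and management of Prometheus and its related monitoring components. The goal of the project is to simplify and automate the configuration of a Prometheus-based monitoring stack. Its main features are: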

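Kubernetes custom resources: use Kubernetes CRDs to deploy and manage Prometheus, Alertmanager, and related components.
Simplified deployment configuration: configure Prometheus settings such as version, persistence, replicas, and retention policy directly in Kubernetes resource manifests.
Prometheus target configuration: generate monitoring target configuration automatically from familiar Kubernetes label queries, with no need to learn a Prometheus-specific configuration language.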

Prometheus

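Prometheus is an open-source systems monitoring and alerting toolkit.

Prometheus collects and stores its metrics as time series data: each metric value is stored with the timestamp at which it was recorded, alongside optional key-value pairs called labels.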

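Features

The main features of Prometheus are:

a multi-dimensional data model, with time series identified by metric name and key/value pairs
PromQL, a flexible query language that leverages this dimensionality
no reliance on distributed storage; single server nodes are autonomous
time series collection via a pull model over HTTP
pushing time series supported via an intermediary gateway
targets discovered via service discovery or static configuration
multiple modes of graphing and dashboarding support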
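
To make the pull model and static target discovery concrete, here is a minimal sketch of a plain Prometheus configuration file (without the Operator). The job name, target address, and extra label are hypothetical placeholders, not values from this article.

```yaml
# prometheus.yml: minimal sketch, all names below are illustrative
global:
  scrape_interval: 15s          # how often Prometheus pulls metrics over HTTP
scrape_configs:
  - job_name: example-app       # hypothetical job name
    metrics_path: /metrics      # HTTP path scraped on every target
    static_configs:
      - targets: ["app.example.internal:8080"]   # hypothetical static target
        labels:
          env: demo             # extra label attached to every scraped series
```

With the Operator, this kind of file is normally generated for you from custom resources rather than written by hand.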

Architecture and ecosystem components

[Figure: Prometheus architecture and ecosystem components]

Installing the Prometheus Operator

Requirements:

For Kubernetes versions before v1.20.z, refer to the Kubernetes compatibility matrix to choose a compatible branch.

Clone kube-prometheus from GitHub.

git clone https://github.com/prometheus-operator/kube-prometheus.git
cd kube-prometheus

Create the namespace and CRDs

kubectl create -f manifests/setup

Create all remaining resources

kubectl create -f manifests/

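Prometheus Operator architecture diagram

[Figure: Prometheus Operator architecture diagram]

The figure above is the official Prometheus Operator architecture diagram. The components run in the Kubernetes cluster in different ways, and the Operator is the core piece. As a controller, it manages custom resources such as Prometheus, ServiceMonitor, Alertmanager, and PrometheusRule: it continuously watches these objects and reconciles the cluster so that the running components match the state they declare.

The latest version of the Operator provides the following CRD resource objects: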

Prometheus
Alertmanager
ServiceMonitor
PodMonitor
Probe
ThanosRuler
PrometheusRule
AlertmanagerConfig

Prometheus

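This CRD declares the desired configuration of a Prometheus deployment running in the Kubernetes cluster. It provides options for replicas, persistent storage, the Alertmanager instances that receive alerts, and more.

For each Prometheus resource, the Operator deploys a correspondingly configured StatefulSet in the same namespace. The Prometheus Pods are configured to mount a Secret named <prometheus-name> that contains the Prometheus configuration.

The CRD uses label selectors to specify which ServiceMonitors the deployed Prometheus instance should cover. The Operator then generates the configuration from the selected ServiceMonitors and updates it in the Secret that holds the configuration.

If no ServiceMonitor selection is provided, the Operator leaves management of the Secret to the user. This makes it possible to supply a custom configuration while still benefiting from the Operator's ability to manage the Prometheus setup.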
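
As an illustration of the options described above, a Prometheus resource might look like the following sketch. The resource name, the team label, the storage size, and the retention period are hypothetical; the service account and Alertmanager names assume a default kube-prometheus installation.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: example                 # hypothetical name
  namespace: monitoring
spec:
  replicas: 2                   # number of Prometheus Pods in the StatefulSet
  retention: 15d                # how long samples are kept
  serviceAccountName: prometheus-k8s   # assumes the kube-prometheus service account
  serviceMonitorSelector:       # which ServiceMonitors this instance covers
    matchLabels:
      team: example             # hypothetical label
  serviceMonitorNamespaceSelector: {}  # empty selector: look in all namespaces
  alerting:
    alertmanagers:              # where alerts are sent
      - namespace: monitoring
        name: alertmanager-main # assumes the kube-prometheus Alertmanager Service
        port: web
  storage:
    volumeClaimTemplate:        # persistent storage for each replica
      spec:
        resources:
          requests:
            storage: 20Gi
```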

Alertmanager

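This CRD defines the configuration of an Alertmanager running in the Kubernetes cluster. It likewise provides a range of options, including persistent storage.

For each Alertmanager resource, the Operator deploys a correspondingly configured StatefulSet in the same namespace. The Alertmanager Pods are configured to include a Secret named <alertmanager-name>, which stores the configuration file in use under the key alertmanager.yaml.

When two or more replicas are configured, the Operator runs the Alertmanager instances in high-availability mode.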
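
A minimal sketch of such a resource, assuming a hypothetical name and storage size, could be:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Alertmanager
metadata:
  name: example                 # hypothetical name
  namespace: monitoring
spec:
  replicas: 3                   # two or more replicas run in high-availability mode
  storage:
    volumeClaimTemplate:        # persistent storage for silences and notification state
      spec:
        resources:
          requests:
            storage: 1Gi
```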

ThanosRuler

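This CRD defines the configuration of a Thanos Ruler component so that it can be run conveniently in a Kubernetes cluster. With Thanos Ruler, recording and alerting rules can be evaluated across multiple Prometheus instances.

A ThanosRuler instance requires at least one queryEndpoint, which points to the location of Thanos Queriers or Prometheus instances. The queryEndpoints are used to set the --query arguments of the Thanos runtime.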
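
The following sketch shows the shape of a ThanosRuler resource. The resource name, the rule label, and the querier address are hypothetical and would need to match an actual Thanos deployment.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ThanosRuler
metadata:
  name: example                 # hypothetical name
  namespace: monitoring
spec:
  ruleSelector:                 # which PrometheusRule objects are evaluated
    matchLabels:
      role: thanos-rules        # hypothetical label
  queryEndpoints:               # passed to Thanos as --query arguments
    - dnssrv+_http._tcp.thanos-querier.monitoring.svc.cluster.local   # hypothetical querier address
```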

ServiceMonitor

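This CRD defines how a dynamic set of services should be monitored, using label selectors to define which Services are selected for monitoring.

For Prometheus to monitor any application inside Kubernetes, an Endpoints object must exist. An Endpoints object is essentially a list of IP addresses. It is usually populated by a Service object, which matches Pods by label selector and adds them to the Endpoints object. A Service can expose one or more ports, each backed by a list of one or more endpoints that typically point to Pods.

The ServiceMonitor object introduced by the Prometheus Operator discovers these Endpoints objects and configures Prometheus to monitor the corresponding Pods. The endpoints section of the ServiceMonitorSpec configures which ports of these Endpoints are scraped for metrics.

Note: endpoints (lowercase) is a field in the ServiceMonitor CRD, whereas Endpoints (capitalized) is a Kubernetes object kind.

Both ServiceMonitors and the discovered targets can come from any namespace, which matters for cross-namespace monitoring. The serviceMonitorNamespaceSelector field of the PrometheusSpec restricts the namespaces from which a given Prometheus server selects ServiceMonitors. The namespaceSelector field of the ServiceMonitorSpec restricts the namespaces in which Endpoints objects may be discovered. To discover targets in all namespaces, set any to true in the namespaceSelector: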

spec:
  namespaceSelector:
    any: true

This raises another question: how to tell which namespace a metric came from. Relabeling can attach the namespace as a label:

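endpoints:
    - port: your-metrics-port  # name of the Service port that serves metrics
      path: /metrics  # HTTP path that serves metrics
      relabelings:
        - sourceLabels:
            - __meta_kubernetes_namespace  # namespace of the discovered target
          targetLabel: namespace  # write the value into the "namespace" label
          replacement: "$1"  # first capture group of the default regex (.*)
          action: replace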

PodMonitor

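This CRD defines how a dynamic set of Pods should be monitored, using label selectors to define which Pods are selected for monitoring. As with ServiceMonitors, teams can agree on conventions for how metrics are exposed.

A Pod is a collection of one or more containers that can expose Prometheus metrics on a number of ports.

The PodMonitor object introduced by the Prometheus Operator discovers these Pods and generates the relevant configuration for the Prometheus server so that it can monitor them.

The podMetricsEndpoints section of the PodMonitorSpec configures which ports of a Pod are scraped for metrics, and with which parameters.

Both PodMonitors and the discovered targets can come from any namespace, which again matters for cross-namespace monitoring. The namespaceSelector field of the PodMonitorSpec restricts the namespaces in which Pods may be discovered. To discover targets in all namespaces, set any to true in the namespaceSelector: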

spec:
  namespaceSelector:
    any: true

The biggest difference between a PodMonitor and a ServiceMonitor is that a PodMonitor does not need a corresponding Service.
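
A complete PodMonitor might look like the sketch below. The resource name, the app label, and the container port name web are hypothetical; the port must be a named port declared in the Pod spec.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: example-app             # hypothetical name
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: example-app          # hypothetical Pod label
  namespaceSelector:
    any: true                   # discover matching Pods in every namespace
  podMetricsEndpoints:
    - port: web                 # hypothetical named container port
      path: /metrics
      interval: 30s             # scrape frequency
```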

Probe

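This CRD defines how a set of Ingresses and static targets should be monitored. Besides the target, a Probe object also requires a prober, which is the service that probes the target and exposes the resulting metrics to Prometheus. The blackbox-exporter, for example, can provide this service.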
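
As a rough sketch, a Probe that checks one static URL through a blackbox-exporter could be written as follows. The prober address assumes a blackbox-exporter Service in the monitoring namespace listening on port 19115, and the http_2xx module is assumed to be defined in that exporter's configuration; the target URL is a placeholder.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Probe
metadata:
  name: example-website         # hypothetical name
  namespace: monitoring
spec:
  jobName: example-website      # job label attached to the resulting series
  interval: 60s                 # how often the target is probed
  module: http_2xx              # assumed blackbox-exporter module
  prober:
    url: blackbox-exporter.monitoring.svc:19115   # assumed address of the prober service
  targets:
    staticConfig:
      static:
        - https://example.com   # placeholder target
```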

PrometheusRule

This CRD configures Prometheus rule files, covering both recording rules and alerting rules, which Prometheus then loads automatically.
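
For illustration, a PrometheusRule holding a single recording rule might look like this sketch. The resource name, group name, and recorded series name are hypothetical; the prometheus and role labels assume the rule selector used by a default kube-prometheus installation.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-recording-rules # hypothetical name
  namespace: monitoring
  labels:
    prometheus: k8s             # assumed to match the Prometheus ruleSelector
    role: alert-rules
spec:
  groups:
    - name: example.recording   # hypothetical group name
      rules:
        - record: instance:node_memory_utilisation:ratio   # name of the precomputed series
          expr: 1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)
```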

AlertmanagerConfig

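In earlier versions, Alertmanager could only be configured through a single Secret containing the complete configuration file. Version v0.43 added this CRD, which splits the Alertmanager configuration into separate sub-objects, making it possible to route alerts to custom receivers and to configure inhibition rules.

An AlertmanagerConfig is defined at the namespace level, and the Operator aggregates the selected objects into one configuration for Alertmanager. The following example shows how to use it. Note that this CRD is not yet stable.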

apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: config-example
  labels:
    alertmanagerConfig: example
spec:
  route:
    groupBy: ['job']
    groupWait: 30s
    groupInterval: 5m
    repeatInterval: 12h
    receiver: 'wechat-example'
  receivers:
  - name: 'wechat-example'
    wechatConfigs:
    - apiURL: 'http://wechatserver:8080/'
      corpID: 'wechat-corpid'
      apiSecret:
        name: 'wechat-config'
        key: 'apiSecret'

---
apiVersion: v1
kind: Secret
type: Opaque
metadata:
  name: wechat-config
data:
  apiSecret: d2VjaGF0LXNlY3JldAo=

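Custom monitoring example

Adding a monitored service

Besides the resource objects, nodes, and components of the Kubernetes cluster itself, you sometimes need to add custom monitoring targets for your own business needs. Adding one takes only a few steps.

Step 1: create a ServiceMonitor object, which adds the scrape target to Prometheus.
Step 2: associate the ServiceMonitor with a Service object that exposes the metrics endpoint.
Step 3: make sure the metrics can actually be fetched through that Service.

Getting metrics, with MySQL as the example

A MySQL exporter (mysql_exporter here) exposes a metrics endpoint for MySQL. Ready-made exporters exist for many open-source services. In this setup the exporter is reachable inside the cluster at: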

mysql-exporter-prometheus.uisee.svc.cluster.local:9104

Creating the ServiceMonitor object

The Service of the monitored application is:

apiVersion: v1
kind: Service
metadata:
  annotations:
    meta.helm.sh/release-name: mysql-exporter
    meta.helm.sh/release-namespace: uisee
  labels:
    app.kubernetes.io/instance: mysql-exporter # label that the ServiceMonitor selects on
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: prometheus
    app.kubernetes.io/version: v0.14.0
    helm.sh/chart: prometheus-mysql-exporter-1.8.1
  name: mysql-exporter-prometheus
  namespace: uisee
spec:
  ports:
  - name: mysql-exporter # port name, referenced by the ServiceMonitor
    port: 9104
    protocol: TCP
    targetPort: 9104
  selector:
    app.kubernetes.io/instance: mysql-exporter
    app.kubernetes.io/name: prometheus-mysql-exporter
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}

The ServiceMonitor resource:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: mysql-k8s
  namespace: monitoring
  labels:
    k8s-app: mysql-k8s
spec:
  endpoints:
    - port: mysql-exporter    # must match the port name in the Service
      interval: 15s           # how often metrics are scraped
  selector:
    matchLabels:
      app.kubernetes.io/instance: mysql-exporter    # selects the Service by label
  jobLabel: k8s-app           # Service label whose value is used as the job name
  namespaceSelector:
    matchNames:               # namespaces in which to look for the Service
      - uisee

If Prometheus reports that it cannot list resources in the uisee namespace, check its RBAC permissions.
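
One possible way to grant the missing permissions is a namespaced Role and RoleBinding like the sketch below. It assumes Prometheus runs under the prometheus-k8s service account in the monitoring namespace, as in a default kube-prometheus installation; adjust the names if your setup differs.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: prometheus-k8s
  namespace: uisee              # namespace that Prometheus needs to read
rules:
  - apiGroups: [""]
    resources: ["services", "endpoints", "pods"]
    verbs: ["get", "list", "watch"]   # read-only access for service discovery
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: prometheus-k8s
  namespace: uisee
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: prometheus-k8s
subjects:
  - kind: ServiceAccount
    name: prometheus-k8s        # assumed service account used by Prometheus
    namespace: monitoring
```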

After a while, check the Targets page of the Prometheus dashboard:

[Figure: Targets page of the Prometheus dashboard]

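Adding alerting rules

The Alerts page of the Prometheus dashboard already lists many alerting rules. These rules come from the project https://github.com/kubernetes-monitoring/kubernetes-mixin and were all installed and configured through the Prometheus Operator.

Create a rule that alerts on node memory usage: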

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: kube-prometheus
    app.kubernetes.io/part-of: kube-prometheus
    prometheus: k8s
    role: alert-rules
  name: kube-prometheus-rules-node
  namespace: monitoring
spec:
  groups:
  - name: node.rules  # name of this group of rules
    rules:
    - alert: NodeMemory   # name of the alert
      annotations:    # description: details; runbook_url: link to a runbook; summary: one-line overview
        description: 'Memory usage on the node is above 50%.'
        runbook_url: https://runbooks.prometheus-operator.dev/runbooks/general/targetdown  # copied from the TargetDown alert; point this at your own runbook
        summary: Node memory usage is high.
      expr: (node_memory_MemTotal_bytes - (node_memory_MemFree_bytes + node_memory_Buffers_bytes + node_memory_Cached_bytes)) / node_memory_MemTotal_bytes * 100 > 50 # alert expression: percentage of memory in use
      for: 2m  # the condition must hold for 2 minutes before the alert fires
      labels:
        severity: warning # label attached to the alert; Alertmanager routes can match on it

Apply the resource above and wait a moment; the rule then shows up in the Prometheus dashboard.

[Figure: the new rule in the Prometheus dashboard]

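Adding alert notifications

First, look at the Alertmanager configuration on the Status page of the Alertmanager UI:

[Figure: Alertmanager Status page showing the current configuration]

This configuration actually comes from a Secret named alertmanager-main-generated, which the Prometheus Operator creates automatically: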

$ kubectl get secret alertmanager-main-generated -n monitoring -o json | jq -r '.data."alertmanager.yaml"' | base64 --decode
"global":
  "resolve_timeout": "5m"  
"inhibit_rules":
- "equal":
  - "namespace"
  - "alertname"
  "source_matchers":
  - "severity = critical"
  "target_matchers":
  - "severity =~ warning|info"
- "equal":
  - "namespace"
  - "alertname"
  "source_matchers":
  - "severity = warning"
  "target_matchers":
  - "severity = info"
"receivers":
- "name": "Default"
- "name": "Watchdog"
- "name": "Critical"
"route":
  "group_by":
  - "namespace"
  "group_interval": "5m"
  "group_wait": "30s"
  "receiver": "Default"
  "repeat_interval": "12h"
  "routes":
  - "matchers":
    - "alertname = Watchdog"
    "receiver": "Watchdog"
  - "matchers":
    - "severity = critical"
    "receiver": "Critical"

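Add your own receiver.

The Prometheus Operator provides the AlertmanagerConfig CRD for this. For example, the following sends alerts with severity critical to Feishu through a webhook receiver named Critical: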

# alertmanager-config.yaml
apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: feishu-hook
  namespace: monitoring
  labels:
    alertmanagerConfig: example
spec:
  receivers:
    - name: Critical
      webhookConfigs:
        - url: http://<webhook> 
          sendResolved: true
  route:
    groupBy: ["namespace"]
    groupWait: 30s
    groupInterval: 5m
    repeatInterval: 12h
    receiver: Critical
    routes:
      - receiver: Critical
        matchers:
          - name: severity
            value: critical

Then add a selector to the Alertmanager resource so that it picks up the object above. Here it selects on the label alertmanagerConfig: example.

# alertmanager-alertmanager.yaml
apiVersion: monitoring.coreos.com/v1
kind: Alertmanager
metadata:
  labels:
    alertmanager: main
  name: main
  namespace: monitoring
spec:
  replicas: 3
  alertmanagerConfigSelector: # selects AlertmanagerConfig objects by label
    matchLabels:
      alertmanagerConfig: example
...

Apply the resources:

kubectl apply -f alertmanager-config.yaml
kubectl apply -f alertmanager-alertmanager.yaml

Alerting topology

[Figure: alerting topology]

Distributed Prometheus architecture

See the Thanos website.

[Figure: distributed Prometheus architecture with Thanos]