Kubernetes Series: The Kubernetes Prometheus Operator


An Operator is an application-specific controller, developed by CoreOS, that extends the Kubernetes API in order to create, configure, and manage complex stateful applications such as MySQL, caches, and monitoring systems. CoreOS currently provides several official Operator implementations, one of which is the Prometheus Operator.

**The diagram below shows the Prometheus Operator architecture.**

The Operator is the core controller. It creates four resource objects: Prometheus, ServiceMonitor, Alertmanager, and PrometheusRule, and then continuously watches and maintains the state of those objects. The Prometheus resource object acts as the Prometheus Server that does the monitoring, while a ServiceMonitor is an abstraction over the exporters we use (as covered in an earlier article, an exporter is a tool that exposes metrics for a service); Prometheus pulls data through the metrics endpoints that ServiceMonitors describe. With this model we no longer need to create or modify scrape rules for every service individually; the cluster is monitored by managing the Operator's resources directly. Note also that a ServiceMonitor selects Services inside the cluster by label, and Prometheus in turn can select multiple ServiceMonitors by label.

![prometheus-operator.png-147.6kB](https://upload-images.jianshu.io/upload_images/6064401-2df45e5d404b875f.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

The Operator is the core component. Acting as a controller, it creates the Prometheus, ServiceMonitor, Alertmanager, and PrometheusRule CRD resource objects, and then continuously watches and maintains the state of these four objects.

* The Prometheus resource object exists as the Prometheus Server itself.

* The ServiceMonitor resource object is an abstraction over the exporters that expose metrics endpoints; Prometheus pulls data through the metrics endpoints described by ServiceMonitors.

* The Alertmanager resource object corresponds to the Alertmanager component.

* The PrometheusRule resource object holds the alerting rule files used by the Prometheus instances.

## A Brief Introduction to CRDs

CRD stands for CustomResourceDefinition. In Kubernetes everything can be treated as a resource, and since Kubernetes 1.7, CRDs have provided the ability to extend the Kubernetes API with custom resources. When we create a new CRD, the Kubernetes API server creates a new RESTful resource path for each version you specify, and we can use that API path to create resources of our own custom types. A CRD can be namespaced or cluster-scoped, as specified by the CRD's `scope` field; as with the existing built-in objects, deleting a namespace deletes all custom objects in that namespace.

Put simply, a CRD is an extension of the Kubernetes API. Every resource in Kubernetes is a collection of API objects, defined just like the `spec` in a yaml file; all custom resources can be used with kubectl in exactly the same way as the built-in Kubernetes resources.
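As a purely illustrative sketch (the `example.com` group and the `CronTab` kind are hypothetical and are not part of kube-prometheus; older clusters may need `apiextensions.k8s.io/v1beta1` instead), a namespaced CRD could be declared like this:

```
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: crontabs.example.com        # must be <plural>.<group>
spec:
  group: example.com                # hypothetical API group
  scope: Namespaced                 # or Cluster for cluster-wide objects
  names:
    plural: crontabs
    singular: crontab
    kind: CronTab
  versions:
  - name: v1
    served: true                    # exposed via /apis/example.com/v1
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              cronSpec:
                type: string
```

Once this is applied, `kubectl get crontabs` works just like it does for any built-in resource.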

In this way, collecting monitoring data from the cluster becomes a matter of Kubernetes watching resource objects directly. Service and ServiceMonitor are both Kubernetes resource objects: a ServiceMonitor selects a class of Services via a labelSelector, and Prometheus in turn selects multiple ServiceMonitors via a labelSelector. Both Prometheus and Alertmanager pick up changes to the monitoring and alerting configuration automatically, so no manual reload is required.
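To make that selector chain concrete, here is a minimal sketch of a Prometheus resource, not the full manifest shipped with kube-prometheus; the `team: frontend` and `role: alert-rules` label values are placeholders you would replace with your own:

```
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s
  namespace: monitoring
spec:
  replicas: 2
  serviceAccountName: prometheus-k8s
  serviceMonitorSelector:               # pick up every ServiceMonitor labelled team=frontend
    matchLabels:
      team: frontend
  serviceMonitorNamespaceSelector: {}   # empty selector = look in all namespaces
  ruleSelector:                         # PrometheusRule objects are matched the same way
    matchLabels:
      role: alert-rules
```

Each ServiceMonitor it selects then uses its own labelSelector to pick the Services to scrape, as the manifests below show.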

* * *

## Installation

The Operator natively supports Prometheus and can monitor the cluster through service discovery, and the installation is generic: the yaml files provided with the operator can basically be used as-is, with perhaps only a few places needing changes.

```

# Official download (if the image versions referenced by the official manifests are unavailable, find matching image versions yourself)

wget -P /root/ https://github.com/coreos/kube-prometheus/archive/master.zip

unzip master.zip

cd /root/kube-prometheus-master/manifests

```

prometheus-serviceMonitorKubelet.yaml (this file is used to collect metrics from the kubelet service)

> No changes required

```

cat prometheus-serviceMonitorKubelet.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: kubelet
  name: kubelet
  namespace: monitoring
spec:
  endpoints:
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    honorLabels: true
    interval: 30s
    port: https-metrics
    scheme: https
    tlsConfig:
      insecureSkipVerify: true
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    honorLabels: true
    interval: 30s
    metricRelabelings:
    - action: drop
      regex: container_(network_tcp_usage_total|network_udp_usage_total|tasks_state|cpu_load_average_10s)
      sourceLabels:
      - __name__
    path: /metrics/cadvisor
    port: https-metrics
    scheme: https
    tlsConfig:
      insecureSkipVerify: true
  jobLabel: k8s-app
  namespaceSelector: # namespace matching: look in the kube-system namespace for objects labelled k8s-app=kubelet and bring the matches into Prometheus monitoring
    matchNames:
    - kube-system
  selector: # these three lines match the Service
    matchLabels:
      k8s-app: kubelet

```

Once the edits here are complete, we can create everything directly from the configuration files:

```

[root@HUOBAN-K8S-MASTER01 manifests]# kubectl apply -f ./

namespace/monitoring unchanged

customresourcedefinition.apiextensions.k8s.io/alertmanagers.monitoring.coreos.com unchanged

customresourcedefinition.apiextensions.k8s.io/podmonitors.monitoring.coreos.com unchanged

customresourcedefinition.apiextensions.k8s.io/prometheuses.monitoring.coreos.com unchanged

customresourcedefinition.apiextensions.k8s.io/prometheusrules.monitoring.coreos.com unchanged

customresourcedefinition.apiextensions.k8s.io/servicemonitors.monitoring.coreos.com unchanged

clusterrole.rbac.authorization.k8s.io/prometheus-operator unchanged

clusterrolebinding.rbac.authorization.k8s.io/prometheus-operator unchanged

deployment.apps/prometheus-operator unchanged

service/prometheus-operator unchanged

serviceaccount/prometheus-operator unchanged

servicemonitor.monitoring.coreos.com/prometheus-operator created

alertmanager.monitoring.coreos.com/main created

secret/alertmanager-main unchanged

service/alertmanager-main unchanged

serviceaccount/alertmanager-main unchanged

servicemonitor.monitoring.coreos.com/alertmanager created

secret/grafana-datasources unchanged

configmap/grafana-dashboard-apiserver unchanged

configmap/grafana-dashboard-controller-manager unchanged

configmap/grafana-dashboard-k8s-resources-cluster unchanged

configmap/grafana-dashboard-k8s-resources-namespace unchanged

configmap/grafana-dashboard-k8s-resources-node unchanged

configmap/grafana-dashboard-k8s-resources-pod unchanged

configmap/grafana-dashboard-k8s-resources-workload unchanged

configmap/grafana-dashboard-k8s-resources-workloads-namespace unchanged

configmap/grafana-dashboard-kubelet unchanged

configmap/grafana-dashboard-node-cluster-rsrc-use unchanged

configmap/grafana-dashboard-node-rsrc-use unchanged

configmap/grafana-dashboard-nodes unchanged

configmap/grafana-dashboard-persistentvolumesusage unchanged

configmap/grafana-dashboard-pods unchanged

configmap/grafana-dashboard-prometheus-remote-write unchanged

configmap/grafana-dashboard-prometheus unchanged

configmap/grafana-dashboard-proxy unchanged

configmap/grafana-dashboard-scheduler unchanged

configmap/grafana-dashboard-statefulset unchanged

configmap/grafana-dashboards unchanged

deployment.apps/grafana configured

service/grafana unchanged

serviceaccount/grafana unchanged

servicemonitor.monitoring.coreos.com/grafana created

clusterrole.rbac.authorization.k8s.io/kube-state-metrics unchanged

clusterrolebinding.rbac.authorization.k8s.io/kube-state-metrics unchanged

deployment.apps/kube-state-metrics unchanged

role.rbac.authorization.k8s.io/kube-state-metrics unchanged

rolebinding.rbac.authorization.k8s.io/kube-state-metrics unchanged

service/kube-state-metrics unchanged

serviceaccount/kube-state-metrics unchanged

servicemonitor.monitoring.coreos.com/kube-state-metrics created

clusterrole.rbac.authorization.k8s.io/node-exporter unchanged

clusterrolebinding.rbac.authorization.k8s.io/node-exporter unchanged

daemonset.apps/node-exporter configured

service/node-exporter unchanged

serviceaccount/node-exporter unchanged

servicemonitor.monitoring.coreos.com/node-exporter created

apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io unchanged

clusterrole.rbac.authorization.k8s.io/prometheus-adapter unchanged

clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader unchanged

clusterrolebinding.rbac.authorization.k8s.io/prometheus-adapter unchanged

clusterrolebinding.rbac.authorization.k8s.io/resource-metrics:system:auth-delegator unchanged

clusterrole.rbac.authorization.k8s.io/resource-metrics-server-resources unchanged

configmap/adapter-config unchanged

deployment.apps/prometheus-adapter configured

rolebinding.rbac.authorization.k8s.io/resource-metrics-auth-reader unchanged

service/prometheus-adapter unchanged

serviceaccount/prometheus-adapter unchanged

clusterrole.rbac.authorization.k8s.io/prometheus-k8s unchanged

clusterrolebinding.rbac.authorization.k8s.io/prometheus-k8s unchanged

prometheus.monitoring.coreos.com/k8s created

rolebinding.rbac.authorization.k8s.io/prometheus-k8s-config unchanged

rolebinding.rbac.authorization.k8s.io/prometheus-k8s unchanged

rolebinding.rbac.authorization.k8s.io/prometheus-k8s unchanged

rolebinding.rbac.authorization.k8s.io/prometheus-k8s unchanged

role.rbac.authorization.k8s.io/prometheus-k8s-config unchanged

role.rbac.authorization.k8s.io/prometheus-k8s unchanged

role.rbac.authorization.k8s.io/prometheus-k8s unchanged

role.rbac.authorization.k8s.io/prometheus-k8s unchanged

prometheusrule.monitoring.coreos.com/prometheus-k8s-rules created

service/prometheus-k8s unchanged

serviceaccount/prometheus-k8s unchanged

servicemonitor.monitoring.coreos.com/prometheus created

servicemonitor.monitoring.coreos.com/kube-apiserver created

servicemonitor.monitoring.coreos.com/coredns created

servicemonitor.monitoring.coreos.com/kube-controller-manager created

servicemonitor.monitoring.coreos.com/kube-scheduler created

servicemonitor.monitoring.coreos.com/kubelet created

```

After the deployment succeeds, we can check the CRDs; the yaml files create them for us automatically. Our ServiceMonitors only take effect once these CRDs exist.

```

[root@HUOBAN-K8S-MASTER01 manifests]# kubectl get crd

NAME                                    CREATED AT
alertmanagers.monitoring.coreos.com     2019-10-18T08:32:57Z
podmonitors.monitoring.coreos.com       2019-10-18T08:32:58Z
prometheuses.monitoring.coreos.com      2019-10-18T08:32:58Z
prometheusrules.monitoring.coreos.com   2019-10-18T08:32:58Z
servicemonitors.monitoring.coreos.com   2019-10-18T08:32:59Z

```

The remaining resources are all deployed into a single namespace; the monitoring namespace contains the Pods created by the operator:

```

[root@HUOBAN-K8S-MASTER01 manifests]# kubectl get pod -n monitoring

NAME                                  READY   STATUS    RESTARTS   AGE
alertmanager-main-0                   2/2     Running   0          11m
alertmanager-main-1                   2/2     Running   0          11m
alertmanager-main-2                   2/2     Running   0          11m
grafana-55488b566f-g2sm9              1/1     Running   0          11m
kube-state-metrics-ff5cb7949-wq7pb    3/3     Running   0          11m
node-exporter-6wb5v                   2/2     Running   0          11m
node-exporter-785rf                   2/2     Running   0          11m
node-exporter-7kvkp                   2/2     Running   0          11m
node-exporter-85bnh                   2/2     Running   0          11m
node-exporter-9vxwf                   2/2     Running   0          11m
node-exporter-bvf4r                   2/2     Running   0          11m
node-exporter-j6d2d                   2/2     Running   0          11m
prometheus-adapter-668748ddbd-d8k7f   1/1     Running   0          11m
prometheus-k8s-0                      3/3     Running   1          11m
prometheus-k8s-1                      3/3     Running   1          11m
prometheus-operator-55b978b89-qpzfk   1/1     Running   0          11m

```

Prometheus and Alertmanager are deployed as StatefulSets, while the other Pods are created by Deployments:

```

[root@HUOBAN-K8S-MASTER01 manifests]# kubectl get deployments.apps -n monitoring

NAME                  READY   UP-TO-DATE   AVAILABLE   AGE
grafana               1/1     1            1           12m
kube-state-metrics    1/1     1            1           12m
prometheus-adapter    1/1     1            1           12m
prometheus-operator   1/1     1            1           12m
[root@HUOBAN-K8S-MASTER01 manifests]# kubectl get statefulsets.apps -n monitoring
NAME                READY   AGE
alertmanager-main   3/3     11m
prometheus-k8s      2/2     11m
# prometheus-operator is the core component here: it is the controller that watches over our prometheus and alertmanager resources

```

Now that everything is created, we still cannot access Prometheus directly:

```

[root@HUOBAN-K8S-MASTER01 manifests]# kubectl get svc -n monitoring |egrep "prometheus|grafana|alertmanage"

alertmanager-main       ClusterIP   10.96.226.38    <none>   9093/TCP                     3m55s
alertmanager-operated   ClusterIP   None            <none>   9093/TCP,9094/TCP,9094/UDP   3m10s
grafana                 ClusterIP   10.97.175.234   <none>   3000/TCP                     3m53s
prometheus-adapter      ClusterIP   10.96.43.155    <none>   443/TCP                      3m53s
prometheus-k8s          ClusterIP   10.105.75.186   <none>   9090/TCP                     3m52s
prometheus-operated     ClusterIP   None            <none>   9090/TCP                     3m
prometheus-operator     ClusterIP   None            <none>   8080/TCP                     3m55s

```

Because the default yaml gives these Services the ClusterIP type, we cannot reach them from outside the cluster. We can either expose them through an Ingress or use NodePort for temporary access. Here I will simply edit the Services to use NodePort.

```

# I edit the Services in place here; alternatively, modify the yaml files and apply them again
kubectl edit svc -n monitoring prometheus-k8s
# note: the Service to edit is prometheus-k8s, the one that has a ClusterIP
kubectl edit svc -n monitoring grafana
kubectl edit svc -n monitoring alertmanager-main
# all three Services need the change; don't edit the wrong ones, they are the ones with a ClusterIP
...
  type: NodePort   # change this line to NodePort

```

For prometheus-k8s, grafana, and alertmanager-main, the only change is the `type: ClusterIP` line.

![image.png](https://upload-images.jianshu.io/upload_images/6064401-efe1ddb97a7d8c65.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

After the modification, check the Services again: they now include node ports, and we can access them from any cluster node.

```

[root@HUOBAN-K8S-MASTER01 manifests]# kubectl get svc -n monitoring |egrep "prometheus|grafana|alertmanage"

alertmanager-main       NodePort    10.96.226.38    <none>   9093:32477/TCP               13m
alertmanager-operated   ClusterIP   None            <none>   9093/TCP,9094/TCP,9094/UDP   12m
grafana                 NodePort    10.97.175.234   <none>   3000:32474/TCP               13m
prometheus-adapter      ClusterIP   10.96.43.155    <none>   443/TCP                      13m
prometheus-k8s          NodePort    10.105.75.186   <none>   9090:32489/TCP               13m
prometheus-operated     ClusterIP   None            <none>   9090/TCP                     12m
prometheus-operator     ClusterIP   None            <none>   8080/TCP                     13m

```
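If you would rather use the Ingress approach mentioned earlier instead of NodePort, a minimal sketch could look like the following; the hostname is hypothetical, and the `networking.k8s.io/v1beta1` apiVersion matches clusters of this era (on Kubernetes 1.19+ you would use `networking.k8s.io/v1`, whose backend fields differ):

```
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: prometheus-k8s
  namespace: monitoring
spec:
  rules:
  - host: prometheus.example.com       # hypothetical hostname, point your DNS here
    http:
      paths:
      - path: /
        backend:
          serviceName: prometheus-k8s  # the ClusterIP Service created by kube-prometheus
          servicePort: 9090
```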

Next, let's look at the Prometheus UI:

```

[root@HUOBAN-K8S-MASTER01 manifests]# kubectl get svc -n monitoring |grep prometheus-k8s

prometheus-k8s   NodePort   10.105.75.186   <none>   9090:32489/TCP   19m

[root@HUOBAN-K8S-MASTER01 manifests]# hostname -i

172.16.17.191

```

We access the cluster at 172.16.17.191:32489.

![image.png](https://upload-images.jianshu.io/upload_images/6064401-90275be27c1d9f95.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

Here kube-controller-manager and kube-scheduler have no managed targets, while everything else does. This comes down to how the Services and labels are defined in the official yaml files.

![image.png](https://upload-images.jianshu.io/upload_images/6064401-80b0213e740d6398.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
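The usual remedy is to give those ServiceMonitors something to match: create Services in kube-system that carry the expected k8s-app labels and point at the control-plane Pods. The sketch below is an assumption-heavy example for kube-scheduler: the Pod label `component: kube-scheduler` and the metrics port 10251 depend on how your control plane is deployed and on the Kubernetes version, so verify both before applying.

```
apiVersion: v1
kind: Service
metadata:
  name: kube-scheduler
  namespace: kube-system
  labels:
    k8s-app: kube-scheduler     # the label the kube-scheduler ServiceMonitor selects on
spec:
  selector:
    component: kube-scheduler   # label carried by the static scheduler Pods (verify on your cluster)
  ports:
  - name: http-metrics          # must match the port name referenced by the ServiceMonitor
    port: 10251
    targetPort: 10251
```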

## Explanation of the configuration file

```

# vim prometheus-serviceMonitorKubeScheduler.yaml
apiVersion: monitoring.coreos.com/v1   # one of the groups listed by kubectl get crd; do not modify
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: kube-scheduler
  name: kube-scheduler                 # the name we define for this ServiceMonitor
  namespace: monitoring
spec:
  endpoints:
  - interval: 30s
    port: http-metrics                 # this is the port name defined on the Service
  jobLabel: k8s-app
  namespaceSelector:                   # which namespaces to match; setting any: true searches all namespaces
    matchNames:
    - kube-system
  selector:                            # roughly: match Services in the kube-system namespace that carry the label k8s-app=kube-scheduler
    matchLabels:
      k8s-app: kube-scheduler

```
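The same pattern extends to your own applications. The sketch below is hypothetical: it assumes a Service called `my-app` in the `default` namespace with the label `app: my-app` and a port named `web` that serves `/metrics`; adjust all of these to match your actual Service.

```
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app                  # hypothetical ServiceMonitor name
  namespace: monitoring
  labels:
    k8s-app: my-app
spec:
  jobLabel: k8s-app
  endpoints:
  - interval: 30s
    port: web                   # port name defined on the application Service
    path: /metrics
  namespaceSelector:
    matchNames:
    - default                   # namespace where the application Service lives
  selector:
    matchLabels:
      app: my-app               # label carried by the application Service
```

In the default kube-prometheus manifests the Prometheus resource uses an empty serviceMonitorSelector, so a new ServiceMonitor like this one should be picked up automatically; if you restrict the selector as in the earlier sketch, remember to add the matching label.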

