Prometheus Monitoring
Public server
frontend
backend
monitoring
grafana: :3000
prometheus: :9090
pushgateway: 0.0.0.0:9091
node-exporter: :9100
alertmanager: :9093
Internal (private network) server
frontend
backend
node-exporter: :9100
node-exporter-pusher:
prometheus
Create the TLS certificate
Prometheus Server And TLS: https://o11y.eu/blog/prometheus-server-tls/
basic auth and tls: https://github.com/prometheus/exporter-toolkit/blob/master/docs/web-config.yml
cd /data/prometheus/config
openssl req -new -newkey rsa:2048 -days 365 -nodes -x509 -keyout prometheus.key -out prometheus.crt -subj "/C=CN/ST=Beijing/L=Beijing/O=exampleOrg/CN=prometheus" -addext "subjectAltName = DNS:localhost"
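To confirm that the certificate really carries the expected SAN, it can be inspected right after generation. The sketch below repeats the command above in a scratch directory and prints the SAN and expiry date. It assumes OpenSSL 1.1.1 or newer (needed for -addext and -ext); the variable names workdir, san and expiry are only illustrative.

```shell
# Generate the same self-signed certificate in a scratch directory.
workdir="$(mktemp -d)"
openssl req -new -newkey rsa:2048 -days 365 -nodes -x509 \
  -keyout "$workdir/prometheus.key" -out "$workdir/prometheus.crt" \
  -subj "/C=CN/ST=Beijing/L=Beijing/O=exampleOrg/CN=prometheus" \
  -addext "subjectAltName = DNS:localhost" 2>/dev/null

# Read back the SAN extension and the expiry date.
san="$(openssl x509 -in "$workdir/prometheus.crt" -noout -ext subjectAltName)"
expiry="$(openssl x509 -in "$workdir/prometheus.crt" -noout -enddate)"
printf '%s\n%s\n' "$san" "$expiry"
```

The SAN has to match the host name that clients use in the scrape target (localhost here), otherwise certificate verification fails.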
web-config.yml
# TLS and basic authentication configuration example.
#
# Additionally, a certificate and a key file are needed.
tls_server_config:
  cert_file: prometheus.crt
  key_file: prometheus.key
# Usernames and passwords required to connect.
# Passwords are hashed with bcrypt: https://github.com/prometheus/exporter-toolkit/blob/master/docs/web-configuration.md#about-bcrypt.
basic_auth_users:
  prometheus:
Generate the HTTP basic auth password hash: https://o11y.tools/pwgen/
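The hash can also be generated locally instead of through a web page. This is a minimal sketch, assuming htpasswd (from apache2-utils or httpd-tools) is installed; hash_password is a hypothetical helper name, not part of any Prometheus tooling.

```shell
# hash_password PASSWORD: print a bcrypt hash (cost 10) suitable as the value
# of a basic_auth_users entry in web-config.yml.
hash_password() {
  if ! command -v htpasswd >/dev/null 2>&1; then
    echo "htpasswd not found" >&2
    return 1
  fi
  # -n: print to stdout, -b: take the password from the argument,
  # -B: use bcrypt, -C 10: bcrypt cost. The empty user name and the
  # leading ':' are stripped so only the hash remains.
  htpasswd -nbBC 10 "" "$1" | tr -d ':\n'
}
```

Usage: hash_password 'my-secret', then paste the output after the user name in basic_auth_users. Passing the password as an argument exposes it in the process list, so this is better suited to a trusted machine.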
prometheus.yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - alertmanager:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
  - "alert.yml"
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    scheme: https
    tls_config:
      ca_file: prometheus.crt
    basic_auth:
      username: prometheus
      password: ""
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: node
    static_configs:
      - targets: ['node-exporter:9100']
  - job_name: "pushgateway"
    scheme: https
    basic_auth:
      username: prometheus
      password: ""
    tls_config:
      ca_file: pushgateway.crt
    static_configs:
      - targets: ['pushgateway:9091']
alert.yml
groups:
  - name: Instances
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          severity: page
        # Prometheus templates apply here in the annotation and label fields of the alert.
        annotations:
          description: '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 1 minute.'
          summary: 'Instance {{ $labels.instance }} down'
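Before starting the server, the three files can be validated with promtool, which ships inside the prom/prometheus image. The following is a sketch under assumptions: check_prometheus_config is a hypothetical helper, and it expects prometheus.yml, alert.yml, web-config.yml and the referenced certificates to sit together in one host directory (the default below is taken from the docker run mounts used in these notes).

```shell
# check_prometheus_config [CONFIG_DIR]: run promtool checks against the
# configuration directory, using the same image version as the server.
check_prometheus_config() {
  config_dir="${1:-/data/prometheus/config}"
  for args in "check config /etc/prometheus/prometheus.yml" \
              "check rules /etc/prometheus/alert.yml" \
              "check web-config /etc/prometheus/web-config.yml"; do
    # $args is intentionally unquoted so it splits into promtool arguments.
    docker run --rm --user "$(id -u)" --entrypoint /bin/promtool \
      -v "$config_dir":/etc/prometheus:ro \
      prom/prometheus:v3.1.0 $args || return 1
  done
}
```

Usage: check_prometheus_config /data/prometheus/config. The function stops at the first failing check and returns a non-zero status.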
Create the network
docker network create monitoring
Start prometheus
docker run --name prometheus -d --restart=always \
--user "$(id -u)" \
-v /data/prometheus/config/prometheus.yml:/etc/prometheus/prometheus.yml \
-v /data/prometheus/config/web-config.yml:/etc/prometheus/web-config.yml \
-v /data/prometheus/config/prometheus.crt:/etc/prometheus/prometheus.crt \
-v /data/prometheus/config/prometheus.key:/etc/prometheus/prometheus.key \
-v /data/prometheus/config/pushgateway.crt:/etc/prometheus/pushgateway.crt \
-v /data/prometheus/config/alert.yml:/etc/prometheus/alert.yml \
-v /data/prometheus/data:/prometheus \
--net monitoring \
prom/prometheus:v3.1.0 --config.file=/etc/prometheus/prometheus.yml --web.config.file=/etc/prometheus/web-config.yml
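The --user flag (for example --user 1000) matters: if the container user cannot write to the mounted data directory, Prometheus fails at startup with:
open /prometheus/queries.active: permission denied
See https://github.com/prometheus/prometheus/issues/5976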
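One way to avoid the permission error is to create the data directory as the same user that is passed to --user before the first start. A minimal sketch; ensure_data_dir is a hypothetical helper name.

```shell
# ensure_data_dir DIR: create DIR if needed and confirm that the current
# user (the uid passed via --user "$(id -u)") can write to it.
ensure_data_dir() {
  dir="$1"
  mkdir -p "$dir" || return 1
  if [ -w "$dir" ]; then
    echo "ok: $dir is writable by uid $(id -u)"
  else
    echo "not writable; fix ownership first, e.g.: sudo chown $(id -u) $dir" >&2
    return 1
  fi
}
```

Usage: ensure_data_dir /data/prometheus/data, and likewise for /data/grafana/data, which is mounted with the same --user setting.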
grafana
docker run -d --name=grafana --restart=always \
--user "$(id -u)" \
-v /data/grafana/data:/var/lib/grafana \
--net monitoring \
grafana/grafana:11.4.0
docker network connect frontend grafana
node-exporter
https://prometheus.io/docs/guides/cadvisor/
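# The --path.* flags point the exporter at the host mounts; without them the mounts are not used.
docker run -d --name node-exporter --restart=always \
-v "/proc:/host/proc:ro" \
-v "/sys:/host/sys:ro" \
-v "/:/rootfs:ro" \
--net monitoring \
prom/node-exporter:v1.9.0 \
--path.procfs=/host/proc \
--path.sysfs=/host/sys \
--path.rootfs=/rootfs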
push-gateway
pushgateway: https://github.com/prometheus/pushgateway
Create the TLS certificate
cd /data/pushgateway/config/
openssl req -new -newkey rsa:2048 -days 365 -nodes -x509 -keyout pushgateway.key -out pushgateway.crt -subj "/C=CN/ST=Beijing/L=Beijing/O=exampleOrg/CN=pushgateway" -addext "subjectAltName = DNS:pushgateway"
cp pushgateway.crt /data/prometheus/config
web-config.yaml
# TLS and basic authentication configuration example.
#
# Additionally, a certificate and a key file are needed.
tls_server_config:
  cert_file: pushgateway.crt
  key_file: pushgateway.key
# Usernames and passwords required to connect.
# Passwords are hashed with bcrypt: https://github.com/prometheus/exporter-toolkit/blob/master/docs/web-configuration.md#about-bcrypt.
basic_auth_users:
  prometheus:
  http:
Start pushgateway
docker run --name pushgateway -d --restart=always \
--user "$(id -u)" \
-p 9091:9091 \
-v /data/pushgateway/config:/pushgateway/config \
--network monitoring \
prom/pushgateway:v1.11.0 --web.config.file=/pushgateway/config/web-config.yaml
Push metrics
echo "my_metric 2" | gzip | curl --insecure -u username:password -H 'Content-Encoding: gzip' --data-binary @- https://pushgateway:9091/metrics/job/test-job/instance/nodename
curl --insecure -u http:password -X DELETE https://pushgateway:9091/metrics/job/test-job/instance/nodename
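The URL path in the commands above is the grouping key: /metrics/job/<job> followed by label/value pairs. The sketch below builds that URL and renders a payload with HELP and TYPE lines; pushgateway_url and render_metrics are hypothetical helper names, and the credentials in the comment are placeholders as in the commands above.

```shell
# pushgateway_url BASE JOB INSTANCE: build the grouping-key URL.
pushgateway_url() {
  printf '%s/metrics/job/%s/instance/%s\n' "${1%/}" "$2" "$3"
}

# render_metrics VALUE: emit one gauge in the text exposition format
# (my_metric is the example metric name used above).
render_metrics() {
  cat <<EOF
# HELP my_metric Example gauge pushed from a shell script.
# TYPE my_metric gauge
my_metric $1
EOF
}

url="$(pushgateway_url https://pushgateway:9091 test-job nodename)"
echo "$url"
render_metrics 2
# To send it:
#   render_metrics 2 | curl --insecure -u username:password --data-binary @- "$url"
```

curl --data-binary sends a POST, which replaces only metrics with the same name in that group; a PUT (curl -X PUT) replaces every metric in the group.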
PushProx
Scrape the internal-network node-exporter through PushProx
https://github.com/prometheus-community/PushProx
Introduction to prometheus-community PushProx: https://blog.csdn.net/doyzfly/article/details/120752044
docker pull prometheuscommunity/pushprox:v0.2.0
server
docker run --name pushprox-proxy -d --restart=always \
--network monitoring \
prometheuscommunity/pushprox:v0.2.0
docker network connect frontend pushprox-proxy
curl {pushprox-proxy}:8080/metrics
Create a self-signed certificate
Configure nginx
client
docker run --name pushprox-client -d --restart=always \
--entrypoint /app/pushprox-client \
--network backend \
--add-host pushprox.example.com:39.100.100.100 \
-v /home/debian/data/pushprox-client/certs/:/app/certs/ \
prometheuscommunity/pushprox:v0.2.0 \
--fqdn=node-exporter \
--proxy-url=http://pushprox.example.com/ \
--tls.cacert=/app/certs/ca.crt \
--tls.cert=/app/certs/client.crt \
--tls.key=/app/certs/client.key
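For Prometheus to reach the client, the scrape job has to go through the proxy via proxy_url. The sketch below writes such a job to a scratch file so it can be reviewed and merged under scrape_configs in prometheus.yml. Assumptions: the proxy is reachable from Prometheus as pushprox-proxy:8080 on the monitoring network, the client registered with --fqdn=node-exporter, and the job name node-internal is made up for illustration.

```shell
# Write a scrape job that routes through the PushProx proxy.
job_file="$(mktemp)"
cat > "$job_file" <<'EOF'
  - job_name: "node-internal"            # hypothetical job name
    proxy_url: http://pushprox-proxy:8080/
    static_configs:
      - targets: ["node-exporter:9100"]  # <client fqdn>:<exporter port>
EOF
cat "$job_file"
```

The target host must equal the FQDN the client registered with; the proxy matches incoming scrape requests to clients by that name.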
node-exporter-pusher
Scraping through PushProx is the recommended approach.
This container pushes the internal-network node-exporter metrics to pushgateway.
docker run --name node-exporter-pusher -d --restart=always \
-v /home/debian/data/node-exporter-pusher/.env:/root/.env \
--network backend \
node-exporter-pusher
alertmanager
docker run --name alertmanager -d --restart=always \
--network monitoring \
-v /data/alertmanager/config:/alertmanager/config \
prom/alertmanager:v0.27.0 --config.file=/alertmanager/config/alertmanager.yml
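The Alertmanager configuration can be checked before the container starts with amtool, which is included in the prom/alertmanager image. A sketch under assumptions: check_alertmanager_config is a hypothetical helper, and the file is assumed to be named alertmanager.yml inside the mounted host directory.

```shell
# check_alertmanager_config [CONFIG_DIR]: validate alertmanager.yml with the
# amtool binary from the same image version as the running container.
check_alertmanager_config() {
  config_dir="${1:-/data/alertmanager/config}"
  docker run --rm --entrypoint /bin/amtool \
    -v "$config_dir":/alertmanager/config:ro \
    prom/alertmanager:v0.27.0 \
    check-config /alertmanager/config/alertmanager.yml
}
```

Usage: check_alertmanager_config /data/alertmanager/config. A non-zero exit status means the file did not parse or a referenced template is missing.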
References
Monitoring MySQL with Prometheus + Grafana (ITPUB blog): https://blog.itpub.net/69982604/viewspace-2743207/
Introduction | prometheus-book: https://yunlzheng.gitbook.io/prometheus-book
Kubernetes tech stack (K8s | Docker | Istio | Python | Golang | cloud native): https://www.k8stech.net/
A monitoring powerhouse: an easy introduction to Prometheus: https://mp.weixin.qq.com/s/W38FcwGmwPj1tp_87FVC1A