Monitoring Series: Deploying AlertManager with WeChat Work Alerts

I. Installation Preparation

The steps below assume an existing Prometheus environment.

1. Download the release package

wget https://github.com/prometheus/alertmanager/releases/download/v0.19.0/alertmanager-0.19.0.linux-amd64.tar.gz

2. Extract and install the package

tar -zvxf alertmanager-0.19.0.linux-amd64.tar.gz
cp -a alertmanager-0.19.0.linux-amd64/ /usr/local/alertmanager

II. Deployment

1. Edit the alertmanager.yml configuration

cat /usr/local/alertmanager/alertmanager.yml

global:
  resolve_timeout: 5m
templates: # alert templates
  - './template/test.tmpl'
route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1m
  receiver: 'wechat'
receivers:
  - name: 'wechat'
    wechat_configs:
      - send_resolved: true
        agent_id: '1000002' # agentId of the self-built application
        to_user: 'LuZhanXing' # user ID of the alert recipient
        api_secret: '' # secret of the self-built application
        corp_id: 'ww4bcc83412351e94e' # enterprise (corp) ID
#inhibit_rules:
#  - source_match:
#      severity: 'critical'
#    target_match:
#      severity: 'warning'
#    equal: ['alertname', 'dev', 'instance']

2. Create a template

mkdir -p /usr/local/alertmanager/template # create the template directory

cat /usr/local/alertmanager/template/test.tmpl
{{ define "wechat.default.message" }}
{{ range .Alerts }}
========Monitoring Alert==========
Status: {{ .Status }}
Severity: {{ .Labels.severity }}
Alert type: {{ .Labels.alertname }}
Application: {{ .Annotations.summary }}
Host: {{ .Labels.instance }}
Details: {{ .Annotations.description }}
Trigger value: {{ .Annotations.value }}
Alert time: {{ .StartsAt.Format "2006-01-02 15:04:05" }}
========end=============
{{ end }}
{{ end }}
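To preview what a notification will look like before wiring everything together, the template can be imitated in a few lines of Python. This is only a rough sketch: the dictionary keys (`status`, `labels`, `annotations`, `startsAt`), the English field labels, and the sample alert are illustrative assumptions, and Alertmanager itself renders the Go template, not this code. Go's reference layout `2006-01-02 15:04:05` corresponds to `%Y-%m-%d %H:%M:%S` in `strftime`.

```python
from datetime import datetime, timezone


def render_alert(alert: dict) -> str:
    """Render one alert in roughly the layout the template produces."""
    labels = alert["labels"]
    annotations = alert["annotations"]
    # Go layout "2006-01-02 15:04:05" maps to this strftime pattern.
    started = alert["startsAt"].strftime("%Y-%m-%d %H:%M:%S")
    lines = [
        "========Monitoring Alert==========",
        "Status: " + alert["status"],
        "Severity: " + labels.get("severity", ""),
        "Alert type: " + labels.get("alertname", ""),
        "Application: " + annotations.get("summary", ""),
        "Host: " + labels.get("instance", ""),
        "Details: " + annotations.get("description", ""),
        "Trigger value: " + annotations.get("value", ""),
        "Alert time: " + started,
        "========end=============",
    ]
    return "\n".join(lines)


def render_message(alerts: list) -> str:
    """Mimic the `range .Alerts` loop: one block per alert."""
    return "\n".join(render_alert(a) for a in alerts)


if __name__ == "__main__":
    # Hypothetical sample alert, for illustration only.
    sample = {
        "status": "firing",
        "labels": {"alertname": "CPU Alert", "severity": "warning",
                   "instance": "192.168.1.81:9100"},
        "annotations": {"summary": "CPU alert", "description": "CPU above 50%",
                        "value": "62.5"},
        "startsAt": datetime(2022, 7, 5, 8, 30, 0, tzinfo=timezone.utc),
    }
    print(render_message([sample]))
```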

3. Start the service

cd /usr/local/alertmanager
nohup ./alertmanager &
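Once the service is running, one way to exercise the WeChat receiver without waiting for a real rule to fire is to post a hand-made alert to Alertmanager's v2 HTTP API (`POST /api/v2/alerts`). The sketch below assumes Alertmanager listens on its default port 9093 on the local host; the alert name, instance, and annotation texts are made-up test values, and the helper function names are hypothetical.

```python
import json
import urllib.request
from datetime import datetime, timezone

# Assumption: Alertmanager runs locally on its default port.
ALERTMANAGER_URL = "http://localhost:9093"


def build_test_alert(name: str, instance: str, severity: str = "warning") -> list:
    """Build the JSON body for /api/v2/alerts: a list of alert objects."""
    return [{
        "labels": {"alertname": name, "instance": instance, "severity": severity},
        "annotations": {
            "summary": "manual test alert",
            "description": "sent by hand to check the notification path",
            "value": "0",
        },
        # RFC 3339 timestamp in UTC.
        "startsAt": datetime.now(timezone.utc).isoformat(),
    }]


def post_alerts(alerts: list, base_url: str = ALERTMANAGER_URL) -> int:
    """POST the alerts to Alertmanager and return the HTTP status code."""
    req = urllib.request.Request(
        base_url + "/api/v2/alerts",
        data=json.dumps(alerts).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.status
```

Calling `post_alerts(build_test_alert("TestAlert", "192.168.1.81:9100"))` on the Alertmanager host should return 200; with the `group_wait: 10s` setting above, the message should reach WeChat Work roughly ten seconds later if the receiver is configured correctly.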

III. Configuring Prometheus

Note: Prometheus was deployed earlier in Docker with its configuration directory mounted, so the container has to be removed and recreated.

1. Stop and remove the container

docker stop prometheus && docker rm prometheus   # run this

2. A standalone docker-compose yml for Prometheus

version: '2'
networks:
  monitor:
    driver: bridge
services:
  prometheus:
    image: prom/prometheus
    container_name: prometheus
    hostname: prometheus
    restart: always
    volumes:
      - /root/promethus/prometheus.yml:/etc/prometheus/prometheus.yml
      - /root/promethus/rule.yml:/etc/prometheus/rule.yml
    ports:
      - "9090:9090"
    networks:
      - monitor

3. Add the following to prometheus.yml

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['192.168.1.81:9093']
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - "rule.yml"
  # - "first_rules.yml"
  # - "second_rules.yml"

4. Configure the alerting rules

cat rule.yml
groups:
  - name: server-rule
    rules:
      - alert: "Memory Alert"
        expr: (node_memory_MemTotal_bytes - (node_memory_MemFree_bytes + node_memory_Buffers_bytes + node_memory_Cached_bytes)) / node_memory_MemTotal_bytes * 100 > 90
        for: 30s
        labels:
          severity: warning
        annotations:
          summary: "Service name: {{$labels.alertname}} memory alert"
          description: "{{ $labels.alertname }} memory utilization is above 90%"
          value: "{{ $value }}"
      - alert: "CPU Alert"
        expr: 100 * (1 - avg(irate(node_cpu_seconds_total{mode="idle"}[2m])) by(instance)) > 50
        for: 30s
        labels:
          severity: warning
        annotations:
          summary: "Service name: {{$labels.alertname}} CPU alert"
          description: "{{ $labels.alertname }} CPU utilization is above 50%"
          value: "{{ $value }}"
      - alert: "Disk Alert"
        expr: 100 * (node_filesystem_size_bytes{fstype=~"xfs|ext4"} - node_filesystem_avail_bytes) / node_filesystem_size_bytes > 90
        for: 30s
        labels:
          severity: warning
        annotations:
          summary: "Service name: {{$labels.alertname}} disk alert"
          description: "{{ $labels.alertname }} disk utilization is above 90%"
          value: "{{ $value }}"
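The arithmetic behind the three expressions is easy to check by hand. The following sketch restates each formula as a plain Python function and applies it to made-up sample numbers (the inputs are illustrative assumptions, not measurements); the function names are hypothetical and only mirror the PromQL above.

```python
def memory_used_percent(total: float, free: float, buffers: float, cached: float) -> float:
    """(MemTotal - (MemFree + Buffers + Cached)) / MemTotal * 100"""
    return (total - (free + buffers + cached)) / total * 100


def cpu_busy_percent(idle_rates: list) -> float:
    """100 * (1 - avg(idle rate per core)), averaged over one instance's cores."""
    return 100 * (1 - sum(idle_rates) / len(idle_rates))


def disk_used_percent(size: float, avail: float) -> float:
    """100 * (size - avail) / size"""
    return 100 * (size - avail) / size


if __name__ == "__main__":
    # Made-up sample values (memory in MiB, idle rates as fractions of a second per second).
    mem = memory_used_percent(total=8192, free=256, buffers=128, cached=256)
    cpu = cpu_busy_percent([0.5, 0.25, 0.25, 0.5])
    disk = disk_used_percent(size=100, avail=8)
    print("memory alert fires:", mem > 90)
    print("cpu alert fires:", cpu > 50)
    print("disk alert fires:", disk > 90)
```

With these inputs all three values exceed their thresholds, so each rule would fire once the condition has held for the 30 seconds set by `for: 30s`.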

5. Recreate the container

nohup docker-compose up &

6. Check the logs

docker logs prometheus
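Besides reading the logs, Prometheus exposes `GET /api/v1/alertmanagers`, which lists the Alertmanager instances it has discovered. The sketch below is one possible check, assuming Prometheus is reachable on `localhost:9090` as published by the port mapping above; the function names are hypothetical.

```python
import json
import urllib.request

# Assumption: Prometheus is published on the local host, port 9090.
PROMETHEUS_URL = "http://localhost:9090"


def active_alertmanagers(payload: dict) -> list:
    """Extract the URLs of active Alertmanagers from the API response body."""
    data = payload.get("data", {})
    return [am["url"] for am in data.get("activeAlertmanagers", [])]


def fetch_alertmanagers(base_url: str = PROMETHEUS_URL) -> dict:
    """Query Prometheus for the Alertmanagers it currently knows about."""
    with urllib.request.urlopen(base_url + "/api/v1/alertmanagers", timeout=5) as resp:
        return json.load(resp)
```

If `active_alertmanagers(fetch_alertmanagers())` returns an empty list, the `alerting` section of prometheus.yml was probably not picked up, or the target address is not reachable from inside the container.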

IV. Settings in the WeChat Work Admin Console

1. Open the WeChat Work admin console

2. See the screenshots below

[Screenshots: WeChat Work admin console]

The AgentId and Secret shown here must match the agent_id and api_secret values in alertmanager.yml.
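To confirm that the corp ID, AgentId, and Secret are valid independently of Alertmanager, the WeChat Work server API can be called directly: `gettoken` exchanges the corp ID and secret for an access token, and `message/send` delivers a text message. The sketch below is a minimal illustration with hypothetical helper names; the credentials passed in are whatever values the admin console shows.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://qyapi.weixin.qq.com/cgi-bin"


def token_url(corp_id: str, secret: str) -> str:
    """URL that exchanges the corp ID and app secret for an access token."""
    query = urllib.parse.urlencode({"corpid": corp_id, "corpsecret": secret})
    return API_BASE + "/gettoken?" + query


def text_message(agent_id: str, to_user: str, content: str) -> dict:
    """Request body for message/send with a plain text message."""
    return {
        "touser": to_user,
        "msgtype": "text",
        "agentid": int(agent_id),
        "text": {"content": content},
    }


def send_test_message(corp_id: str, secret: str, agent_id: str,
                      to_user: str, content: str) -> dict:
    """Fetch a token, send one text message, and return the API's JSON reply."""
    with urllib.request.urlopen(token_url(corp_id, secret), timeout=5) as resp:
        token = json.load(resp)["access_token"]
    req = urllib.request.Request(
        API_BASE + "/message/send?access_token=" + token,
        data=json.dumps(text_message(agent_id, to_user, content)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.load(resp)
```

A reply whose `errcode` is 0 indicates the message was accepted; a missing `access_token` in the first response usually points to a wrong corp ID or secret.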

V. Alert Result

[Screenshot: alert result]

Original article by admin. If you repost it, please credit the source: https://www.starz.top/2022/07/05/%e9%83%a8%e7%bd%b2alertmanager%e4%bc%81%e4%b8%9a%e5%be%ae%e4%bf%a1%e5%91%8a%e8%ad%a6/
