Monitoring with Prometheus and Grafana: Data Collection and Service Startup

Contents: Introduction · Deploying Prometheus · Deploying Grafana · Monitoring server nodes · Pushgateway data collection and Alertmanager monitoring

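I. Introduction
Prometheus is an open-source system monitoring and alerting toolkit. It has joined the CNCF, becoming the second project hosted there after Kubernetes (k8s), and Kubernetes clusters are commonly monitored with it. Prometheus collects data through a wide range of exporters, accepts pushed data through Pushgateway, and performs well enough to support clusters of more than ten thousand machines.
Grafana is an open-source application written in Go, used mainly to visualize large volumes of metric data. It is among the most popular tools for displaying time-series data in infrastructure and application analytics, and it supports most of the commonly used time-series databases. Grafana supports many different data sources. Each data source has its own query editor, tailored to the features and capabilities that the data source exposes. Officially supported data sources include Graphite, Elasticsearch, InfluxDB, Prometheus, CloudWatch, MySQL, and OpenTSDB.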

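II. Deploying Prometheus
Download the latest release from the official Prometheus download page ("Download | Prometheus"), which also carries the companion components that Prometheus needs.
[root@localhost ~]# mkdir -p /app/prometheus
[root@localhost ~]# cd /app/prometheus
[root@localhost prometheus]# wget …
[root@localhost prometheus]# tar zxvf prometheus-2.33.3.linux-amd64.tar.gz
[root@localhost prometheus]# mv prometheus-2.33.3.linux-amd64 prometheus-2.33.3
[root@localhost prometheus]# cd prometheus-2.33.3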

Look at the contents of the Prometheus package, then edit the configuration file to set up the various kinds of monitoring.
[root@localhost prometheus-2.33.3]# ls
console_libraries  consoles  data  LICENSE  NOTICE  prometheus  prometheus.yml  promtool

[root@localhost prometheus-2.33.3]# vim prometheus.yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration (alerting)
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  # prometheus server
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["192.168.137.100:9090"]

  # collector (Pushgateway)
  - job_name: 'pushgateway'
    static_configs:
      - targets: ['192.168.137.100:9091']
        labels:
          instance: pushgateway

  # node monitoring
  - job_name: 'node_exporter'
    static_configs:
      - targets: ['192.168.137.100:9100','192.168.137.2:9100','192.168.137.3:9100','47.99.57.254:8100']

  # MySQL database monitoring
  - job_name: 'mysqld_exporter'
    static_configs:
      - targets: ['47.99.57.254:9104']

  # nginx monitoring
  - job_name: 'nginx_node'
    static_configs:
      - targets: ['192.168.137.3:9913']
        labels:
          instance: web1
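Before starting the server it can help to validate the edited file. promtool, which ships in the same package (it appears in the ls output above), has a `check config` subcommand for this. The following is a minimal sketch that assumes the install paths used in this article; the helper name `check_prometheus_config` is only illustrative and is not part of Prometheus.

```shell
#!/usr/bin/env bash
# Validate a Prometheus configuration file with promtool before (re)starting the server.
# check_prometheus_config is an illustrative helper name, not part of Prometheus.
# Arguments: $1 = path to the promtool binary, $2 = path to prometheus.yml
check_prometheus_config() {
  local promtool="$1" config="$2"
  if [ ! -x "$promtool" ]; then
    echo "promtool not found or not executable: $promtool" >&2
    return 2
  fi
  # 'promtool check config FILE' exits non-zero when the file is invalid.
  "$promtool" check config "$config"
}

# Example call with this article's paths (uncomment on the Prometheus host):
# check_prometheus_config /app/prometheus/prometheus-2.33.3/promtool \
#                         /app/prometheus/prometheus-2.33.3/prometheus.yml
```

A non-zero exit status from the helper means the file should be fixed before Prometheus is started or restarted.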
[root@localhost prometheus-2.33.3]# ./prometheus --config.file=/app/prometheus/prometheus-2.33.3/prometheus.yml --storage.tsdb.path=/app/prometheus/prometheus-2.33.3/data/ &
The service starts successfully. For safety, and to make later maintenance easier, it is also worth configuring Prometheus to start at boot.
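A unit file for this could look like the sketch below. It assumes the install path and flags from the start command above; the helper name `write_prometheus_unit`, the `After=network.target` line, and the `Restart=on-failure` setting are my own illustrative choices rather than part of the original setup.

```shell
#!/usr/bin/env bash
# Write a systemd unit for Prometheus into the given directory.
# write_prometheus_unit is an illustrative helper name.
# Argument: $1 = target directory (normally /etc/systemd/system)
write_prometheus_unit() {
  local unit_dir="$1"
  local home=/app/prometheus/prometheus-2.33.3   # install path used in this article
  cat > "${unit_dir}/prometheus.service" <<EOF
[Unit]
Description=Prometheus
After=network.target

[Service]
Restart=on-failure
ExecStart=${home}/prometheus --config.file=${home}/prometheus.yml --storage.tsdb.path=${home}/data/

[Install]
WantedBy=multi-user.target
EOF
}

# As root on the Prometheus host:
# write_prometheus_unit /etc/systemd/system
# systemctl daemon-reload
# systemctl enable --now prometheus.service
```

If Prometheus was already started by hand with `&`, that process would need to be stopped first so the unit can bind port 9090.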

cat > /etc/systemd/system/prometheus.service << …

III. Deploying Grafana
… > ./grafana.log 2>&1 &
Check that the service process and port are healthy (they should show OK).

Once Grafana is reachable, add Prometheus to Grafana as a data source by specifying its IP address and port.
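The data source can be added in the Grafana web UI. As an alternative, Grafana can load data sources from provisioning files when it starts. The sketch below assumes Prometheus listens on 192.168.137.100:9090, as in prometheus.yml above. The helper name `write_grafana_datasource` and the file name prometheus.yaml are illustrative, and the location of the provisioning directory depends on how Grafana was installed.

```shell
#!/usr/bin/env bash
# Write a Grafana data source provisioning file that points at Prometheus.
# write_grafana_datasource is an illustrative helper name.
# Arguments: $1 = Grafana's provisioning/datasources directory
#            $2 = Prometheus URL (optional; defaults to this article's address)
write_grafana_datasource() {
  local dir="$1" prom_url="${2:-http://192.168.137.100:9090}"
  mkdir -p "$dir"
  cat > "${dir}/prometheus.yaml" <<EOF
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: ${prom_url}
    isDefault: true
EOF
}

# Example (the directory is an assumption; adjust it to your Grafana install):
# write_grafana_datasource /path/to/grafana/conf/provisioning/datasources
```

Grafana reads provisioning files at startup, so it would need a restart to pick up the new file.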

IV. Node Monitoring with node_exporter
[root@localhost prometheus]# wget …
[root@localhost prometheus]# tar zxvf node_exporter-1.3.1.linux-amd64.tar.gz
[root@localhost prometheus]# mv node_exporter-1.3.1.linux-amd64 node_exporter
[root@localhost prometheus]# cd node_exporter
[root@localhost node_exporter]# ./node_exporter --web.listen-address=:9100 >node_exporter.log 2>&1 &
The service starts successfully and Prometheus begins monitoring the node. The node_exporter targets (ip:port) must be listed in prometheus.yml; that was already done above, so all that remains is to start node_exporter on each of those nodes. For safety, and to make later maintenance easier, it is also worth configuring node_exporter to start at boot.

vim /etc/systemd/system/node_exporter.service
[Unit]
Description=node_exporter Monitoring System
Documentation=node_exporter Monitoring System

[Service]
ExecStart=/your/local/path/node_exporter --web.listen-address=:9100

[Install]
WantedBy=multi-user.target

# Enable start at boot
systemctl daemon-reload
systemctl start node_exporter.service
systemctl status node_exporter.service
systemctl enable node_exporter.service

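As the screenshot above shows, the nodes have been added to Prometheus monitoring, so we can now visualize them with Grafana.
Import a ready-made community dashboard template and look at the result. (You can of course build your own dashboard instead.)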
 
 

V. Pushgateway Data Collection and Alertmanager Monitoring
1. Deploying Pushgateway

[root@localhost ~]# cd /app/prometheus/
[root@localhost prometheus]# wget …
[root@localhost prometheus]# tar zxvf pushgateway-1.4.2.linux-amd64.tar.gz
[root@localhost prometheus]# mv pushgateway-1.4.2.linux-amd64 pushgateway
[root@localhost prometheus]# cd pushgateway
[root@localhost pushgateway]# nohup /app/prometheus/pushgateway/pushgateway --web.listen-address :9091 > /app/prometheus/pushgateway/pushgateway.log 2>&1 &
Because the startup output was redirected to /app/prometheus/pushgateway/pushgateway.log, you can cat that file to read the startup messages. Then check that the pushgateway process is running.

Verify that data is being collected: open IP:9091/metrics. If metrics are displayed as shown below, the service is collecting data normally.
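To see a value of your own on that page, a sample can be pushed with curl. Pushgateway accepts the Prometheus text format on `/metrics/job/<job>/instance/<instance>`. The sketch below assumes the Pushgateway address used in this article; the helper names and the metric name `demo_logged_in_users` are made up for illustration.

```shell
#!/usr/bin/env bash
# Build the Pushgateway URL for a given job and instance grouping key.
# Arguments: $1 = host:port of the Pushgateway, $2 = job name, $3 = instance name
pushgateway_url() {
  local addr="$1" job="$2" instance="$3"
  printf 'http://%s/metrics/job/%s/instance/%s\n' "$addr" "$job" "$instance"
}

# Push a single sample in the Prometheus text format ("<name> <value>").
# Arguments: address, job, instance, metric name, metric value
push_metric() {
  local addr="$1" job="$2" instance="$3" name="$4" value="$5"
  printf '%s %s\n' "$name" "$value" |
    curl --silent --show-error --fail --data-binary @- \
      "$(pushgateway_url "$addr" "$job" "$instance")"
}

# Example (run on a host that can reach the Pushgateway):
# push_metric 192.168.137.100:9091 demo_job "$(hostname)" demo_logged_in_users 3
```

Because the pushgateway scrape job above does not set `honor_labels`, Prometheus should store the pushed job and instance labels as `exported_job` and `exported_instance`.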

2. Deploying Alertmanager

[root@localhost prometheus]# wget …
[root@localhost prometheus]# tar zxvf alertmanager-0.23.0.linux-amd64.tar.gz
[root@localhost prometheus]# mv alertmanager-0.23.0.linux-amd64 alertmanager
[root@localhost prometheus]# cd alertmanager

Set up the Alertmanager service unit.
[root@localhost alertmanager]# cat /usr/lib/systemd/system/alertmanager.service
[Unit]
Description=prometheus

[Service]
Restart=on-failure
ExecStart=/app/prometheus/alertmanager/alertmanager --config.file=/app/prometheus/alertmanager/alertmanager.yml

[Install]
WantedBy=multi-user.target

Start the Alertmanager service and enable it at boot.
[root@localhost alertmanager]# systemctl start alertmanager
[root@localhost alertmanager]# systemctl enable alertmanager
[root@localhost alertmanager]# ps -elf | grep alertmanager
4 S root 913 1 0 80 0 - 181955 futex_ 08:24 ? 00:00:15 /app/prometheus/alertmanager/alertmanager --config.file=/app/prometheus/alertmanager/alertmanager.yml
0 S root 3384 3008 0 80 0 - 28206 pipe_w 10:42 pts/0 00:00:00 grep --color=auto alertmanager
Prometheus also has to be told about Alertmanager: add the basic alerting configuration to prometheus.yml, then restart Prometheus so that it reloads the configuration.
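A minimal sketch of the two sections involved is given here. It assumes that Alertmanager runs on the Prometheus host (192.168.137.100) on its default port 9093 and that the rule files live in a rule directory next to prometheus.yml. Both are assumptions on my part, and the helper name `print_alerting_snippet` is illustrative. The printed sections would replace the commented-out `alerting` and `rule_files` placeholders in prometheus.yml rather than being appended to it.

```shell
#!/usr/bin/env bash
# Print the alerting and rule_files sections for prometheus.yml.
# print_alerting_snippet is an illustrative helper name.
# Argument: $1 = Alertmanager host:port (optional; the default is an assumption)
print_alerting_snippet() {
  local alertmanager="${1:-192.168.137.100:9093}"
  cat <<EOF
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - ${alertmanager}

rule_files:
  - "rule/*.yml"
EOF
}

# Print the snippet, then merge it into prometheus.yml by hand.
print_alerting_snippet
```

Relative paths under `rule_files` are resolved against the directory that contains prometheus.yml, and glob patterns such as `rule/*.yml` are accepted.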

I keep the alerting rules in a uniform format and store them in a dedicated rule directory, with separate files for CPU, disk, and memory alerts.

Below is a simple test; adjust the rules to suit the environment you are monitoring.

vim rule/cpu_rule.yml
groups:
- name: Host
  rules:
  - alert: HostCPU
    expr: 100 * (1 - avg(irate(node_cpu_seconds_total{mode="idle"}[2m])) by(instance)) > 10
    for: 5m
    labels:
      serverity: high
    annotations:
      summary: "{{ $labels.instance }}: High CPU Usage Detected"
      description: "{{ $labels.instance }}: CPU usage is {{ $value }}, above 10%"
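To see what the CPU expression computes: `irate` takes the last two samples of the idle counter inside the 2m window and returns their per-second increase, which is the fraction of time a core spent idle. `avg ... by(instance)` then averages that fraction over all cores of a host, and `100 * (1 - ...)` turns it into a usage percentage. The sketch below repeats the arithmetic for a single core with made-up sample values; the helper name `cpu_usage_percent` is illustrative.

```shell
#!/usr/bin/env bash
# Usage percentage of one core from two samples of its idle-seconds counter.
# Arguments: $1 = idle counter at the first sample, $2 = idle counter at the second,
#            $3 = seconds between the two samples
cpu_usage_percent() {
  local idle_before="$1" idle_after="$2" seconds="$3"
  LC_ALL=C awk -v a="$idle_before" -v b="$idle_after" -v s="$seconds" \
    'BEGIN { printf "%.2f\n", 100 * (1 - (b - a) / s) }'
}

# Idle for 13.5 of the last 15 seconds, so the core was busy 10% of the time.
cpu_usage_percent 1000.0 1013.5 15   # prints 10.00
```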

vim rule/disk_rule.yml
groups:
- name: Host
  rules:
  - alert: HostDisk
    expr: 100 * (node_filesystem_size_bytes{fstype=~"xfs|ext4"} - node_filesystem_avail_bytes) / node_filesystem_size_bytes > 30
    for: 5m
    labels:
      serverity: low
    annotations:
      summary: "{{ $labels.instance }}: High Disk Usage Detected"
      description: "{{ $labels.instance }}, mountpoint {{ $labels.mountpoint }}: Disk Usage is {{ $value }}, above 30%"

vim rule/Memory_rule.yml
groups:
- name: Host
  rules:
  - alert: HostMemory
    expr: (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100 > 20
    for: 5m
    labels:
      serverity: middle
    annotations:
      summary: "{{ $labels.instance }}: High Memory Usage Detected"
      description: "{{ $labels.instance }}: Memory Usage is {{ $value }}, above 20%"
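The memory expression can be cross-checked on a node by hand. On Linux, node_exporter reads these two values from /proc/meminfo, so the same percentage can be computed locally and compared with what Prometheus reports. This is a rough sketch; the helper name `mem_usage_percent` is illustrative.

```shell
#!/usr/bin/env bash
# Memory usage percentage, computed the same way as the HostMemory expression:
# (MemTotal - MemAvailable) / MemTotal * 100
# Argument: $1 = meminfo-style file (optional; defaults to /proc/meminfo)
mem_usage_percent() {
  local meminfo="${1:-/proc/meminfo}"
  LC_ALL=C awk '
    /^MemTotal:/     { total = $2 }
    /^MemAvailable:/ { avail = $2 }
    END {
      if (total > 0) {
        printf "%.2f\n", (total - avail) / total * 100
      } else {
        exit 1
      }
    }
  ' "$meminfo"
}

# Example on a monitored Linux node:
# mem_usage_percent
```

If the printed value is above 20, the HostMemory rule should fire once the condition has held for 5 minutes.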

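To make the effect easy to see, the thresholds are deliberately low: an alert fires when CPU usage exceeds 10%, disk usage exceeds 30%, or memory usage exceeds 20%, as shown below: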