Distributed Storage Systems: Enabling the Dashboard on a Ceph Cluster and Monitoring Ceph with Prometheus (Part 5)


Before importing the node_exporter metric data, let's first take a look at the Prometheus configuration file.
[root@ceph-mgr02 prometheus]# cat prometheus.yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9090"]
[root@ceph-mgr02 prometheus]#

Note: the Prometheus configuration file is composed mainly of the global, rule_files, scrape_configs, alerting, remote_write, and remote_read sections.

global: the global configuration section.

scrape_configs: a collection of scrape configurations that define the set of monitored targets and the parameters describing how to scrape their metric data. Typically, each scrape configuration corresponds to a single job, and each target can be defined directly via static configuration (static_configs) or discovered automatically through one of the service-discovery mechanisms Prometheus supports.

alerting (alertmanagers): the set of Alertmanager instances Prometheus may use, plus the parameters governing how to interact with them. Each Alertmanager can likewise be defined directly via static configuration (static_configs) or discovered through one of the service-discovery mechanisms Prometheus supports.

remote_write: configures the "remote write" mechanism. Define this section when Prometheus needs to persist its data to an external storage system (e.g. InfluxDB); Prometheus then sends sample data over HTTP to the adapter specified by the URL.

remote_read: configures the "remote read" mechanism. Prometheus hands incoming query requests to the adapter specified by the URL; the adapter translates the query conditions into a query against the remote storage service and converts the response into a format Prometheus can use.
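To make the remote_write / remote_read description above concrete, a minimal sketch of such a section follows. The adapter host, port, and paths are placeholders for illustration only, not endpoints from this deployment:

```yaml
# Illustrative only -- the adapter host/port/paths are placeholders.
remote_write:
  - url: "http://influxdb-adapter:9201/write"
remote_read:
  - url: "http://influxdb-adapter:9201/read"
```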
Commonly used global configuration parameters:
global:
  # How frequently to scrape targets by default.
  [ scrape_interval: <duration> | default = 1m ]
  # How long until a scrape request times out.
  [ scrape_timeout: <duration> | default = 10s ]
  # How frequently to evaluate rules.
  [ evaluation_interval: <duration> | default = 1m ]
  # The labels to add to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    [ <labelname>: <labelvalue> ... ]

# Rule files specifies a list of globs. Rules and alerts are read from
# all matching files.
rule_files:
  [ - <filepath_glob> ... ]

# A list of scrape configurations.
scrape_configs:
  [ - <scrape_config> ... ]

# Alerting specifies settings related to the Alertmanager.
alerting:
  alert_relabel_configs:
    [ - <relabel_config> ... ]
  alertmanagers:
    [ - <alertmanager_config> ... ]

# Settings related to the remote write feature.
remote_write:
  [ - <remote_write> ... ]

# Settings related to the remote read feature.
remote_read:
  [ - <remote_read> ... ]

Within a scrape configuration section, the syntax for defining a job with static_configs is:
# The targets specified by the static config.
targets:
  [ - '<host>' ]

# Labels assigned to all metrics scraped from the targets.
labels:
  [ <labelname>: <labelvalue> ... ]

The syntax for defining a job with file_sd_configs is:
[
  {
    "targets": [ "<host>", ... ],
    "labels": {
      "<labelname>": "<labelvalue>", ...
    }
  },
  ...
]

Bringing the node_exporter metric data into the Prometheus server
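As an alternative to static_configs, the node_exporter targets could also be supplied through a file matching the file_sd_configs syntax above. A minimal sketch of generating such a target file follows; the addresses and labels are illustrative, not taken from this cluster:

```python
import json
import os
import tempfile

# Illustrative targets/labels in the file_sd_configs JSON format shown above.
targets = [
    {
        "targets": ["192.168.0.71:9100", "192.168.0.72:9100"],
        "labels": {"cluster": "ceph", "role": "node"},
    }
]

# Prometheus re-reads matching files on its refresh_interval, so adding a
# host only requires rewriting this JSON file -- no server restart needed.
path = os.path.join(tempfile.gettempdir(), "node_targets.json")
with open(path, "w") as f:
    json.dump(targets, f, indent=2)

with open(path) as f:
    print(json.load(f)[0]["targets"])
# ['192.168.0.71:9100', '192.168.0.72:9100']
```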

Note: by default, node_exporter exposes its metric data at /metrics; if you change the output path, you must point Prometheus at the new path with metrics_path in the Prometheus configuration file.
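Putting the pieces together, the job added to prometheus.yml for node_exporter might look like the following sketch; the target address assumes node_exporter is running on this host with its default port 9100:

```yaml
scrape_configs:
  - job_name: "node_exporter"
    # metrics_path: "/custom"  # only needed if the default /metrics was changed
    static_configs:
      - targets: ["192.168.0.75:9100"]
```

Prometheus must be restarted (or reloaded) for the change to take effect, which is the next step.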
Restart the Prometheus server
[root@ceph-mgr02 prometheus]# systemctl restart prometheus.service
[root@ceph-mgr02 prometheus]# systemctl status prometheus.service
● prometheus.service - The Prometheus 2 monitoring system and time series database.
   Loaded: loaded (/usr/lib/systemd/system/prometheus.service; disabled; vendor preset: disabled)
   Active: active (running) since Sun 2022-10-09 22:20:41 CST; 9s ago
     Docs: https://prometheus.io
 Main PID: 2344 (prometheus)
   CGroup: /system.slice/prometheus.service
           └─2344 /usr/local/prometheus/prometheus --storage.tsdb.path=/var/lib/prometheus --config.file=/usr/local/prometh...

Oct 09 22:20:41 ceph-mgr02.ilinux.io prometheus[2344]: ts=2022-10-09T14:20:41.163Z caller=head.go:542 level=info compo...hile"
Oct 09 22:20:41 ceph-mgr02.ilinux.io prometheus[2344]: ts=2022-10-09T14:20:41.179Z caller=head.go:613 level=info compo...ent=1
Oct 09 22:20:41 ceph-mgr02.ilinux.io prometheus[2344]: ts=2022-10-09T14:20:41.179Z caller=head.go:613 level=info compo...ent=1
Oct 09 22:20:41 ceph-mgr02.ilinux.io prometheus[2344]: ts=2022-10-09T14:20:41.179Z caller=head.go:619 level=info compo…19721ms
Oct 09 22:20:41 ceph-mgr02.ilinux.io prometheus[2344]: ts=2022-10-09T14:20:41.180Z caller=main.go:993 level=info fs_ty...MAGIC
Oct 09 22:20:41 ceph-mgr02.ilinux.io prometheus[2344]: ts=2022-10-09T14:20:41.180Z caller=main.go:996 level=info msg="...rted"
Oct 09 22:20:41 ceph-mgr02.ilinux.io prometheus[2344]: ts=2022-10-09T14:20:41.180Z caller=main.go:1177 level=info msg=...s.yml
Oct 09 22:20:41 ceph-mgr02.ilinux.io prometheus[2344]: ts=2022-10-09T14:20:41.181Z caller=main.go:1214 level=info msg="Comp…μs
Oct 09 22:20:41 ceph-mgr02.ilinux.io prometheus[2344]: ts=2022-10-09T14:20:41.181Z caller=main.go:957 level=info msg="...sts."
Oct 09 22:20:41 ceph-mgr02.ilinux.io prometheus[2344]: ts=2022-10-09T14:20:41.181Z caller=manager.go:941 level=info co...r..."
Hint: Some lines were ellipsized, use -l to show in full.
[root@ceph-mgr02 prometheus]# ss -tnl
State      Recv-Q Send-Q Local Address:Port    Peer Address:Port
LISTEN     0      128    172.16.30.75:6800     *:*
LISTEN     0      128    192.168.0.75:6800     *:*
LISTEN     0      128    172.16.30.75:6801     *:*
LISTEN     0      128    192.168.0.75:6801     *:*
LISTEN     0      128    192.168.0.75:6802     *:*
LISTEN     0      128    172.16.30.75:6802     *:*
LISTEN     0      128    192.168.0.75:6803     *:*
LISTEN     0      128    172.16.30.75:6803     *:*
LISTEN     0      128    *:22                  *:*
LISTEN     0      100    127.0.0.1:25          *:*
LISTEN     0      5      *:8443                *:*
LISTEN     0      128    [::]:22               [::]:*
LISTEN     0      100    [::1]:25              [::]:*
LISTEN     0      128    [::]:9090             [::]:*
[root@ceph-mgr02 prometheus]#
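With Prometheus back up and listening on 9090, the scrape targets can be verified through its HTTP API (e.g. curl http://localhost:9090/api/v1/targets). A minimal sketch of parsing such a response follows; the JSON below is a trimmed, hypothetical sample, not output captured from this cluster:

```python
import json

# Trimmed, hypothetical /api/v1/targets response used for illustration.
sample = """
{
  "status": "success",
  "data": {
    "activeTargets": [
      {"labels": {"instance": "localhost:9090", "job": "prometheus"}, "health": "up"},
      {"labels": {"instance": "192.168.0.75:9100", "job": "node_exporter"}, "health": "up"}
    ]
  }
}
"""

resp = json.loads(sample)
for t in resp["data"]["activeTargets"]:
    # A healthy target reports health == "up"; "down" means the last scrape failed.
    print(t["labels"]["job"], t["labels"]["instance"], t["health"])
# prometheus localhost:9090 up
# node_exporter 192.168.0.75:9100 up
```

Checking health this way is handy for confirming that the node_exporter job was picked up after the restart, before moving on to dashboards.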
