Prometheus and VictoriaMetrics: Resilient Metrics Storage Infrastructure

In this article, my colleague Luca Carboni, a DevOps Engineer at Miro's Amsterdam office, explains what our metrics storage infrastructure looks like. Every component in it is highly available and fault tolerant, has a clearly defined role, can retain data for a long time, and is cost-efficient.





The stack in question: Prometheus, Alertmanager, Pushgateway, Blackbox exporter, Grafana, and VictoriaMetrics.





Configuring High Availability and Fault Tolerance for Prometheus

Prometheus federation, Prometheus. , Grafana : , - .





, . , Prometheus , .





, . (prometheus.yml) , . A B .





. IaC ( ) Terraform (CM) Ansible, . , . , .Alertmanager, Pushgateway, Blackbox,





.





Alertmanager , Prometheus Alertmanager, . Alertmanager , : Prometheus A Prometheus B. IaC CM, Alertmanager .
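Alertmanager can run as a cluster of peers, which is what lets it receive the same alert from both Prometheus A and Prometheus B without notifying twice. As a rough illustration, the sketch below builds the command line for each Alertmanager instance so that every instance lists the others as peers. The `--config.file`, `--cluster.listen-address` and `--cluster.peer` flags are Alertmanager's own; the host names, the config path, and the helper name `alertmanager_args` are assumptions made for this example and do not describe our actual setup.

```python
from typing import Dict, List

CLUSTER_PORT = 9094  # Alertmanager's default cluster (gossip) port


def alertmanager_args(hosts: List[str], config_path: str) -> Dict[str, List[str]]:
    """Return, per host, the arguments that make it peer with all other hosts."""
    result: Dict[str, List[str]] = {}
    for host in hosts:
        args = [
            f"--config.file={config_path}",
            f"--cluster.listen-address=0.0.0.0:{CLUSTER_PORT}",
        ]
        # Every instance points at every other instance, never at itself.
        args += [
            f"--cluster.peer={peer}:{CLUSTER_PORT}" for peer in hosts if peer != host
        ]
        result[host] = args
    return result


if __name__ == "__main__":
    # Hypothetical host names, used only for illustration.
    hosts = ["alertmanager-a", "alertmanager-b"]
    for host, args in alertmanager_args(hosts, "/etc/alertmanager/alertmanager.yml").items():
        print(host, " ".join(args))
```

Generating the peer list from a single host inventory fits the IaC and configuration management approach: adding an instance means changing one list rather than editing each node by hand.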





- , . , โ€” Prometheus A Prometheus B .





Pushgateway , . . Pushgateway DNS Failover , ( active/passive). , .





Blackbox Prometheus A Prometheus B.





, Prometheus, Alertmanager, , Pushgateway active/passive Blackbox. .





. VPC (Virtual Private Cloud), , . . , . โ€” . , , .





Prometheus, , . , . . " , ".





VictoriaMetrics

Prometheus . Prometheus , . . 10 . , ? , โ€” . Prometheus , - , .





Cortex, Thanos, M3DB, VictoriaMetrics . Prometheus, โ€” , , โ€” .





In the end, we chose VictoriaMetrics.





VictoriaMetrics comes in two versions: an all-in-one single-node version and a cluster version.





โ€” . (), .





The cluster version of VictoriaMetrics consists of three components: vmstorage (stores the data), vminsert (accepts incoming data), and vmselect (serves queries).





vminsert . , , . vminsert (stateless), , , .





, vminsert โ€” (storageNode) , (replicationFactor=N, N โ€” vmstorage). vminsert? Prometheus, remote_write.
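To make the vminsert settings more concrete, here is a minimal sketch that assembles the `-storageNode` and `-replicationFactor` flags and the URL a Prometheus `remote_write` entry would point at. The flags, the default ports (8400 and 8480) and the `/insert/<accountID>/prometheus/api/v1/write` path come from the VictoriaMetrics cluster documentation; the host names, the account ID of 0, and the helper names are assumptions made for this example.

```python
from typing import List

VMSTORAGE_INSERT_PORT = 8400  # default port where vmstorage accepts data from vminsert
VMINSERT_HTTP_PORT = 8480     # default vminsert HTTP port


def vminsert_args(storage_hosts: List[str], replication_factor: int) -> List[str]:
    """Build vminsert flags that write every sample to N vmstorage nodes."""
    if not 1 <= replication_factor <= len(storage_hosts):
        raise ValueError(
            "replication factor must be between 1 and the number of vmstorage nodes"
        )
    nodes = ",".join(f"{host}:{VMSTORAGE_INSERT_PORT}" for host in storage_hosts)
    return [f"-storageNode={nodes}", f"-replicationFactor={replication_factor}"]


def remote_write_url(vminsert_host: str, account_id: int = 0) -> str:
    """URL for the remote_write section of prometheus.yml."""
    return (
        f"http://{vminsert_host}:{VMINSERT_HTTP_PORT}"
        f"/insert/{account_id}/prometheus/api/v1/write"
    )


if __name__ == "__main__":
    # Hypothetical host names, used only for illustration.
    print(" ".join(vminsert_args(["vmstorage-1", "vmstorage-2"], 2)))
    print(remote_write_url("vminsert.internal"))
```

In practice the URL would point at whatever sits in front of the vminsert instances (for example a load balancer), since vminsert itself is stateless.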





vmstorage โ€” , VictoriaMetrics. vminsert vmselect, vmstorage (stateful), . vmstorage , (IO latency) (IOPS), , Prometheus.





vmstorage:





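  • storageDataPath: the path to the directory on disk where the data is stored;
  • retentionPeriod: how long the data is retained;
  • dedup.minScrapeInterval: the deduplication interval (samples of the same time series that fall within one interval are collapsed into one).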
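The effect of `dedup.minScrapeInterval` is easiest to see on a small example. Assuming two Prometheus instances scrape the same targets and both write to VictoriaMetrics, each time series arrives twice with slightly different timestamps. The sketch below is a simplified model of the idea (keep only the latest sample in each discrete interval); it is not VictoriaMetrics' actual implementation, and the exact interval boundaries used by the real system may differ.

```python
from typing import Dict, List, Tuple

Sample = Tuple[int, float]  # (timestamp in milliseconds, value)


def deduplicate(samples: List[Sample], interval_ms: int) -> List[Sample]:
    """Keep one sample (the latest) per discrete interval of interval_ms."""
    if interval_ms <= 0:
        return sorted(samples)  # deduplication disabled
    latest: Dict[int, Sample] = {}
    for ts, value in sorted(samples):
        bucket = ts // interval_ms
        latest[bucket] = (ts, value)  # later timestamps overwrite earlier ones
    return [latest[bucket] for bucket in sorted(latest)]


if __name__ == "__main__":
    # Two hypothetical scrapers with a 15 s interval, offset by 5 s.
    from_a = [(0, 1.0), (15_000, 2.0)]
    from_b = [(5_000, 1.0), (20_000, 2.0)]
    print(deduplicate(from_a + from_b, 15_000))
```

Setting the interval equal to the Prometheus scrape interval is the usual choice in this model: a smaller value leaves duplicates in place, a larger one discards genuine samples.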





vmstorage , replicationFactor, vminsert, (N) .





vmstorage , , , vmstorage .





vmselect . , , . , Prometheus, , . , , Grafana. vminsert, vmselect .





Grafana

Grafana , Prometheus, , VictoriaMetrics. , VictoriaMetrics (MetricsQL) PromQL, Prometheus. Grafana.





Grafana SQLite . SQLite , , . . , PostgreSQL Amazon RDS, Multi-AZ , .
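Switching Grafana from SQLite to PostgreSQL comes down to the `[database]` section of `grafana.ini`. The sketch below renders that section with Python's standard `configparser`. The option names (`type`, `host`, `name`, `user`, `password`) are Grafana's own; the RDS endpoint, database name and credentials are placeholders invented for this example, and in a real deployment the password would come from a secret store rather than from code.

```python
import configparser
import io


def grafana_database_ini(host: str, name: str, user: str, password: str) -> str:
    """Render the [database] section of grafana.ini for a PostgreSQL backend."""
    # Interpolation is disabled so that a '%' in the password is written as-is.
    cfg = configparser.ConfigParser(interpolation=None)
    cfg["database"] = {
        "type": "postgres",
        "host": host,  # host:port of the PostgreSQL endpoint
        "name": name,
        "user": user,
        "password": password,
    }
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    # Placeholder values, used only for illustration.
    print(grafana_database_ini("grafana-db.example.internal:5432", "grafana", "grafana", "change-me"))
```

Because every Grafana instance reads and writes the same external database, any number of instances can be started from this one section.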





Grafana PostgreSQL. , Grafana . PostgreSQL Grafana, , vendor lock. , .





, Grafana. .





Grafana VictoriaMetrics โ€” , vmselect, โ€” Prometheus . .
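As an illustration of that connection, the sketch below builds a Grafana data source definition of type Prometheus that points at vmselect. The `/select/<accountID>/prometheus` path and the default port 8481 come from the VictoriaMetrics cluster documentation, and the field names mirror the entries Grafana accepts in data source provisioning; the data source name, host name and account ID of 0 are assumptions made for this example.

```python
import json
from typing import Dict

VMSELECT_HTTP_PORT = 8481  # default vmselect HTTP port


def vmselect_datasource(name: str, vmselect_host: str, account_id: int = 0) -> Dict[str, str]:
    """Describe VictoriaMetrics to Grafana as an ordinary Prometheus data source."""
    return {
        "name": name,
        "type": "prometheus",  # vmselect exposes a Prometheus-compatible query API
        "access": "proxy",     # queries go through the Grafana backend
        "url": f"http://{vmselect_host}:{VMSELECT_HTTP_PORT}/select/{account_id}/prometheus",
    }


if __name__ == "__main__":
    # Hypothetical name and host, used only for illustration.
    print(json.dumps(vmselect_datasource("VictoriaMetrics", "vmselect.internal"), indent=2))
```

Existing dashboards written in PromQL should keep working against such a data source, since MetricsQL is designed to be compatible with PromQL.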





***





, , . , vmstorage , Amazon S3.





, . , .





Links:

  • Prometheus: https://prometheus.io/
  • Alertmanager: https://github.com/prometheus/alertmanager
  • Pushgateway: https://github.com/prometheus/pushgateway
  • Blackbox exporter: https://github.com/prometheus/blackbox_exporter
  • Exporters: https://prometheus.io/docs/instrumenting/exporters/
  • Grafana: https://grafana.com/
  • VictoriaMetrics: https://victoriametrics.com/









Miro.











