Reinventing Monitoring: The Evolution Story of Prometheus and Thanos
Monitoring is the process of collecting and analyzing data to identify and resolve performance problems in a platform and to improve its reliability. It is essential for any organization that relies on IT systems to deliver products and services reliably. However, monitoring at scale can be challenging, particularly for organizations with complex, distributed IT environments. Prometheus is a popular tool that many organizations use to monitor their systems. It makes HTTP calls to endpoints, collects the metrics they expose, and stores them as time series data. This blog discusses how to perform monitoring at scale using Prometheus and how this approach has evolved over the last few years.
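To make the pull model concrete, here is a minimal sketch of what a scrape looks like in principle: fetch a text payload over HTTP, parse each sample line, and append it to an in-memory series keyed by metric name and labels. This is not how Prometheus itself is implemented. The names `parse_exposition`, `TimeSeriesStore`, and `scrape` are hypothetical, and the parser handles only a simplified subset of the text exposition format (no timestamps, no escaping edge cases).

```python
import re
import time
import urllib.request
from collections import defaultdict

# Matches sample lines such as: http_requests_total{method="get",code="200"} 1027
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_exposition(text):
    """Parse a simplified subset of the text format into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and HELP/TYPE comments
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        # Sort labels so the same series always maps to the same key.
        labels = tuple(sorted(LABEL_RE.findall(raw_labels or "")))
        samples.append((name, labels, float(value)))
    return samples


class TimeSeriesStore:
    """Hypothetical in-memory store: one list of (timestamp, value) per series."""

    def __init__(self):
        self.series = defaultdict(list)

    def append(self, scrape_time, samples):
        for name, labels, value in samples:
            self.series[(name, labels)].append((scrape_time, value))


def scrape(url, store, timeout=5.0):
    """Pull one payload from a metrics endpoint and record its samples."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = resp.read().decode("utf-8")
    store.append(time.time(), parse_exposition(body))
```

Calling `scrape` on a fixed interval for each target is the essence of the model: every call adds one point to every series the target exposes, which is why memory use grows with the number of targets and metrics.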
Standalone Prometheus
Grafana is an open-source platform for data visualization, monitoring, and analysis. It lets users build dashboards with panels and visualize data stored in Prometheus (and other sources). Traditionally, an organization starts with a standalone Prometheus scraping metrics and a Grafana instance for visualization. As the scale grows, organizations tend to run multiple standalone Prometheus instances for different purposes, with a single Grafana instance configured with multiple data sources for visualization. This setup does not scale well and comes with challenges.
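One way to see why several standalone instances become awkward is to sketch what a combined view requires: the same query has to be sent to each instance, and the results stitched together on the client side. The sketch below uses the Prometheus HTTP API instant-query endpoint (`/api/v1/query`). The helper names `instant_query` and `merge_vectors`, the added `source` label, and the instance names in the usage note are assumptions for illustration, not something Grafana does in this exact form.

```python
import json
import urllib.parse
import urllib.request


def instant_query(base_url, promql, timeout=5.0):
    """Run one instant query against a single Prometheus instance."""
    params = urllib.parse.urlencode({"query": promql})
    url = base_url.rstrip("/") + "/api/v1/query?" + params
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def merge_vectors(responses):
    """Merge instant-vector responses from several instances.

    `responses` maps a source name to a decoded API response. Each sample is
    tagged with a hypothetical `source` label so series with identical labels
    coming from different instances stay distinguishable.
    """
    merged = []
    for source, payload in sorted(responses.items()):
        if payload.get("status") != "success":
            continue  # skip failed instances instead of failing the whole view
        for item in payload["data"]["result"]:
            labels = dict(item["metric"], source=source)
            timestamp, value = item["value"]
            merged.append((labels, float(timestamp), float(value)))
    return merged
```

A caller would build `responses` by looping over its instances, for example `{name: instant_query(url, "up") for name, url in instances.items()}`, where `instances` is a hypothetical mapping of names to base URLs. Every new instance adds another request, another failure mode, and more client-side merging.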
Challenges of using Standalone Prometheus
A standalone Prometheus supports only vertical scaling. As the number of metrics and targets to be scraped increases, a single Prometheus instance struggles to cope with the escalating workload, eventually leading to memory constraints that can hinder its performance and functionality. The need is for horizontal s...