Harnessing the Titans: What Product Managers Should Know About Prometheus & Thanos

Rohit Verma
4 min read · Oct 22, 2023

In the pantheon of software, Prometheus and Thanos are not merely mythological figures but critical systems that oversee the realm of monitoring and scalability in cloud-native environments. For product managers navigating the odyssey of delivering reliable, scalable products, understanding these tools is akin to possessing the master key to effective service monitoring and data management. This article unfolds the tapestry of Prometheus and Thanos, illustrating their prowess through detailed examples and delineating their significance in a product manager’s arsenal.

What is Prometheus?

Prometheus is an open-source monitoring and alerting toolkit originally built at SoundCloud. It has since become a part of the Cloud Native Computing Foundation. Here’s what you need to know:

  1. Data Collection: Prometheus uses a pull model to scrape metrics from instrumented jobs.
  2. Flexible Query Language: It provides a flexible query language called PromQL that allows for slicing and dicing of collected data.
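  3. Storage: It stores time series data in its own local, on-disk database. That storage lives on a single node and is not replicated or clustered, so retention is bounded by that node’s disk.
  4. Alerting: Prometheus evaluates alerting rules and sends the resulting alerts to Alertmanager, a separate component of the Prometheus project that deduplicates, groups, and routes them to the appropriate channels.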
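
To make the pull model concrete, here is a minimal sketch of what an instrumented job exposes and what a scraper reads back. It uses only the Python standard library and no network; the metric name `checkout_requests_total` and both helper functions are hypothetical, chosen for illustration. In a real setup Prometheus would fetch this text over HTTP on a schedule, typically from a `/metrics` endpoint.

```python
import re


def render_metrics(name, help_text, samples):
    """Render one counter in the Prometheus text exposition format.

    samples is a list of (labels_dict, value) pairs.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples:
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"


# Matches simple sample lines of the form: name{label="value",...} 123
SAMPLE_LINE = re.compile(r"^(\w+)\{(.*)\}\s+(\S+)$")


def parse_scrape(body):
    """Parse a scraped body into (name, labels, value) tuples.

    Simplified: only handles samples that carry labels and skips anything else.
    """
    result = []
    for line in body.splitlines():
        if not line or line.startswith("#"):
            continue  # blank lines and HELP/TYPE comments carry no samples
        match = SAMPLE_LINE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(re.findall(r'(\w+)="([^"]*)"', raw_labels))
        result.append((name, labels, float(value)))
    return result


if __name__ == "__main__":
    body = render_metrics(
        "checkout_requests_total",  # hypothetical metric name
        "Checkout requests handled.",
        [({"status": "200"}, 120), ({"status": "500"}, 3)],
    )
    print(body)
    print(parse_scrape(body))
```

The point for a product manager is the direction of the flow: the application only publishes its current counters, and the monitoring system decides when and how often to collect them.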

What is Thanos?

While Prometheus is powerful, it has limitations, especially when it comes to long-term storage and high availability. Enter Thanos.

Thanos is an extension to Prometheus that provides:

  1. Global Query View: It aggregates data from multiple Prometheus instances, giving a unified querying interface.
  2. Long-Term Storage: With Thanos, you can store your metrics in object storage like AWS S3, Google Cloud Storage, or any S3-compatible storage.
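  3. High Availability: You can run redundant Prometheus replicas, and Thanos deduplicates their overlapping data at query time. If one replica fails, queries are still served from the others and from data already uploaded to object storage.
  4. Downsampling and Compaction: Thanos compacts stored data and downsamples older data, which keeps queries over long time ranges fast.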
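
The global view and high-availability ideas can be sketched in a few lines. The example below merges the same series as seen by two Prometheus replicas, so that a gap in one replica is filled by the other. The function name and data layout are hypothetical, and the logic is a deliberate simplification: it assumes both replicas record samples at identical timestamps, which real replicas do not, so the actual Thanos deduplication is more involved than this.

```python
def merge_replicas(series_by_replica):
    """Merge one series as recorded by several replicas.

    series_by_replica maps a replica name to a {timestamp: value} dict.
    Returns a sorted list of (timestamp, value) pairs with no duplicates.
    """
    merged = {}
    for replica in sorted(series_by_replica):
        for ts, value in series_by_replica[replica].items():
            # Keep the first value seen for a timestamp; later replicas
            # only fill in timestamps that are still missing.
            merged.setdefault(ts, value)
    return sorted(merged.items())


if __name__ == "__main__":
    replicas = {
        "prometheus-a": {0: 1.0, 15: 2.0, 45: 4.0},  # missed the scrape at t=30
        "prometheus-b": {0: 1.0, 30: 3.0, 45: 4.0},  # missed the scrape at t=15
    }
    print(merge_replicas(replicas))
```

Neither replica has the full picture on its own, but the merged view does, which is the practical benefit of running more than one.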

Why Should Product Managers Care?

  1. Informed Decision Making: Understanding system health and performance can guide product decisions. For instance, if a particular feature is causing system strain, it might need to be optimized or rethought.
  2. Customer Experience: Slow or non-responsive applications lead to poor user experiences. Monitoring can help detect such issues before they affect a large number of users.
  3. Resource Allocation: Knowing the system’s performance can help in making decisions about infrastructure costs and where to invest in terms of scalability.

Real-World Examples

Example 1: Feature Rollout

Imagine you’re rolling out a new feature. With Prometheus, you can monitor the feature’s impact in real-time. If there’s an unexpected spike in resource usage or errors, you can quickly roll back or make necessary adjustments.
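
As a rough illustration of the rollback decision, the sketch below computes an error ratio from two counters sampled over a window and compares it to a threshold. The function names, the sample values, and the 5% threshold are assumptions made for this example. In practice the same ratio would usually be expressed as a PromQL query over your own metrics and wired to an alert rather than computed by hand.

```python
def counter_increase(samples):
    """Total increase of a counter over a window of successive readings.

    A counter only goes up, so a drop means the process restarted and the
    counter began again from zero; the new reading is then the increase.
    """
    total = 0.0
    for prev, cur in zip(samples, samples[1:]):
        total += (cur - prev) if cur >= prev else cur
    return total


def should_roll_back(request_samples, error_samples, max_error_ratio=0.05):
    """Return True when errors exceed max_error_ratio of requests in the window."""
    requests = counter_increase(request_samples)
    if requests == 0:
        return False  # no traffic in the window, nothing to judge
    return counter_increase(error_samples) / requests > max_error_ratio


if __name__ == "__main__":
    requests = [1000, 1100, 1200]   # hypothetical readings of a request counter
    errors = [10, 12, 30]           # hypothetical readings of an error counter
    print(should_roll_back(requests, errors))
```

Agreeing on the threshold before the rollout starts is the product decision here; the arithmetic is the easy part.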

Example 2: Long-Term Analysis

With Thanos, you can analyze the performance data of your product over a long period. This can provide insights into seasonal trends, the impact of marketing campaigns, or the long-term effects of certain features.
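
Long-range analysis is practical largely because old data is downsampled. The sketch below shows the general idea: raw samples are grouped into fixed windows, and a few aggregates are kept per window so that averages and extremes can still be derived later. The function name and the five-minute window are illustrative assumptions, not a description of how Thanos stores its data internally.

```python
def downsample(samples, window_seconds=300):
    """Group (timestamp, value) samples into fixed windows with aggregates."""
    buckets = {}
    for ts, value in samples:
        start = ts - ts % window_seconds  # align to the start of the window
        buckets.setdefault(start, []).append(value)
    return [
        {
            "start": start,
            "count": len(values),
            "sum": sum(values),
            "min": min(values),
            "max": max(values),
        }
        for start, values in sorted(buckets.items())
    ]


if __name__ == "__main__":
    raw = [(0, 1.0), (60, 3.0), (299, 2.0), (300, 10.0)]
    for window in downsample(raw):
        print(window)
```

Keeping `sum` and `count` rather than a precomputed average means windows can be combined again later without losing accuracy.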

Example 3: Infrastructure Planning

By monitoring system health and performance, you can make informed decisions about when to scale up (or down) your infrastructure, potentially saving costs and ensuring smooth user experiences.
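
One simple way to turn monitoring data into a scaling decision is to fit a straight line through recent usage and estimate when it will cross a limit. PromQL offers `predict_linear()` for this kind of extrapolation; the sketch below shows the same idea in plain Python. The function names and the disk-usage numbers are hypothetical, and the fit needs at least two samples with different timestamps.

```python
def fit_line(samples):
    """Least-squares slope and intercept for (timestamp, value) samples."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    cov = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    slope = cov / var_t
    return slope, mean_v - slope * mean_t


def seconds_until(samples, limit):
    """Seconds after the last sample until the trend reaches limit.

    Returns None when usage is flat or falling, since the limit is then
    never reached on the current trend.
    """
    slope, intercept = fit_line(samples)
    if slope <= 0:
        return None
    last_t = samples[-1][0]
    return max(0.0, (limit - intercept) / slope - last_t)


if __name__ == "__main__":
    # Hypothetical disk usage in percent, sampled once an hour.
    usage = [(0, 50.0), (3600, 60.0), (7200, 70.0)]
    print(seconds_until(usage, limit=100.0))
```

A straight line is a crude model, so treat the answer as an early warning for a planning conversation rather than a precise deadline.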

[Figure: strategy map outlining how product managers can deploy Prometheus and Thanos in practice]

Prometheus and Thanos are powerful tools that provide insights into the health and performance of software products. As a product manager, understanding these tools can empower you to make better decisions, ensure a great user experience, and optimize resources.

Thanks for reading! If you’ve got ideas to contribute to this conversation, please comment. If you like what you read and want to see more, clap me some love! Follow me here, or connect with me on LinkedIn or Twitter.

Written by Rohit Verma

Group Product Manager @AngelOne, ex-@Flipkart, @Cleartrip @IIM Bangalore. https://topmate.io/rohit_verma_pm
