
Recording Rules and Streaming Aggregation Rules

The previous post covered the basic concepts of a monitoring system. This post introduces the recording rules provided by Prometheus and the streaming aggregation rules provided by VictoriaMetrics.

Recording Rules: Precomputed Queries

What Are Recording Rules

Services report a large number of metrics, whether pulled or pushed, so the raw time series can number in the millions or more. Queries over that many raw series are slow to execute. Recording rules are the solution to this problem, as the documentation reads:

Recording rules allow you to precompute frequently needed or computationally expensive expressions and save their result as a new set of time series.

Recording rules are evaluated ahead of time, so query speed is no longer affected by the sheer volume of raw metrics. This is a feature provided by Prometheus, as the picture below shows:

monitor_recording_rules_workflow.png
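
The workflow above can be sketched in a few lines of Python. This is only an illustration of the idea (read stored series, compute, write the result back as a new series), not how Prometheus is implemented. In Prometheus itself a rule is declared in a rule file as a `record` name plus a PromQL `expr`. The `storage` dictionary, the `evaluate_rule` function, and the metric names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical in-memory "storage": series name -> {label pairs: latest value}.
storage = {
    "http_requests_total": {
        (("job", "api"), ("instance", "a")): 120.0,
        (("job", "api"), ("instance", "b")): 80.0,
        (("job", "web"), ("instance", "c")): 50.0,
    }
}

def evaluate_rule(storage, source, record, by):
    """Sum the latest values of `source` grouped by the label `by`,
    then write the result back to storage as a new series `record`."""
    totals = defaultdict(float)
    for labels, value in storage[source].items():
        group = dict(labels)[by]
        totals[((by, group),)] += value
    storage[record] = dict(totals)  # new series; the raw series are kept
    return storage[record]

# One evaluation cycle; a real system would repeat this on a fixed interval.
result = evaluate_rule(storage, "http_requests_total", "job:http_requests:sum", "job")
print(result)
```

A dashboard would then read the small precomputed series `job:http_requests:sum` instead of scanning every raw series at query time.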

Pros

  • Great flexibility:
    Recording rules don't care how the metrics were collected; they simply read the metrics already stored in Prometheus. This makes them flexible enough for complex queries, and users are not limited to plain aggregations.

  • Supports Metrics Reported by Pull or Push:
    Because recording rules read from Prometheus storage directly, they treat metrics the same way whether they were pulled by Prometheus or pushed by services.

Cons

  • Possible PromQL Limitation:
    Recording rules are executed periodically by Prometheus: each evaluation runs a PromQL query over the stored metrics and writes the resulting time series back to Prometheus. The limitations of PromQL therefore apply to recording rules as well. For example, the --query.max-samples flag caps the number of samples a single query can load, and its default value is 50,000,000 (50 million).

  • Additional Computing/Storage/IO Resources: Recording rules precompute results and write them back as new time series. Because the raw metrics are kept and the new series are stored alongside them, Prometheus needs additional computing, storage, and IO resources.

  • Latency: The outputs of recording rules lag behind the raw metrics, because they are computed only after the raw metrics have been collected.
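
To get a feel for the scale at which the sample limit becomes relevant, here is a deliberately crude back-of-the-envelope calculation. It only counts how many raw samples exist in a time window. Whether a particular query actually hits --query.max-samples depends on how Prometheus evaluates that query, which this sketch does not model. The workload numbers and the `samples_in_window` helper are made up for illustration.

```python
MAX_SAMPLES = 50_000_000  # default value of --query.max-samples

def samples_in_window(series_count, range_seconds, scrape_interval_seconds):
    """Rough count of raw samples stored for `series_count` series
    over a window of `range_seconds` at the given scrape interval."""
    return series_count * (range_seconds // scrape_interval_seconds)

# Hypothetical workload: 1,000,000 series, 1 hour window, 15 s scrape interval.
needed = samples_in_window(1_000_000, 3600, 15)
print(needed, needed > MAX_SAMPLES)
```

With these assumed numbers the window holds 240 million samples, several times the default limit, which is the kind of workload where a single rule over all raw series becomes risky.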

Streaming Aggregation Rules: Real-Time Aggregation During Ingestion

What Are Streaming Aggregation Rules

Streaming aggregation allows users to define aggregation rules that run on ingested data in real time. Unlike recording rules, which run after the data has been stored, streaming aggregation happens during data ingestion. Ingestion here refers to the stage after the metrics are collected and before they are written to storage.

Streaming aggregation is not a Prometheus feature. It is provided by VictoriaMetrics, which aggregates incoming samples in streaming mode by time and by labels before the data is written to remote storage (or to local storage for single-node VictoriaMetrics).

img.png
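
The idea of aggregating "by time and by labels" before storage can be sketched as follows. This is a minimal illustration, not the VictoriaMetrics implementation: the `StreamAggregator` class, its `keep_input` option, and the sample data are all hypothetical, and only a sum aggregation is shown.

```python
from collections import defaultdict

class StreamAggregator:
    """Sums incoming samples per (time window, label value) before storage."""

    def __init__(self, interval_seconds, by, keep_input=False):
        self.interval = interval_seconds
        self.by = by                       # label to group by
        self.keep_input = keep_input       # also store raw samples if True
        self.windows = defaultdict(float)  # (window_start, group) -> running sum
        self.stored = []                   # stands in for the storage backend

    def ingest(self, timestamp, labels, value):
        # Aggregate the sample as it arrives; nothing is read back from storage.
        window_start = timestamp - timestamp % self.interval
        self.windows[(window_start, labels[self.by])] += value
        if self.keep_input:
            self.stored.append(("raw", timestamp, labels[self.by], value))

    def flush(self, now):
        """Write out every window that has fully elapsed by `now`."""
        done = sorted(k for k in self.windows if k[0] + self.interval <= now)
        for key in done:
            window_start, group = key
            self.stored.append(("agg", window_start, group, self.windows.pop(key)))

agg = StreamAggregator(interval_seconds=60, by="job")
agg.ingest(5, {"job": "api", "instance": "a"}, 2.0)
agg.ingest(20, {"job": "api", "instance": "b"}, 3.0)
agg.ingest(70, {"job": "api", "instance": "a"}, 4.0)  # belongs to the next window
agg.flush(now=60)
print(agg.stored)
```

With `keep_input=False`, only the aggregated sample for the first window reaches storage; the three raw samples are never written, and the sample at timestamp 70 stays in memory until its window closes.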

Note that VictoriaMetrics streaming aggregation rules support many ingestion protocols, covering both the pull and push models.

Pros

  • No limitation on time series size
  • No additional IO Resource:
    Because the aggregation happens before the data is written to storage, there is no need to read raw metrics back from storage and write additional time series to it.
  • Optional for original metrics:
    It's configurable to discard the original metrics and store the aggregated metrics only.
  • Low latency:
    As the aggregation happens during metric collection, there is no separate evaluation step that runs after the raw metrics have been stored.

Cons

  • Limited Aggregation Rules:
    Complex aggregation expressions are not supported; only a set of basic aggregation functions is provided.

Summary

For backend users, recording rules usually satisfy the need to speed up queries, unless the data size exceeds the query limits. Streaming aggregation is the solution when the number of metrics is very large or when the aggregated metrics must be available with low latency.

VictoriaMetrics streaming aggregation supports the push model as long as the metrics are pushed via a supported data ingestion protocol, as the documentation reads:

The aggregation is applied to all the metrics received via any supported data ingestion protocol and/or scraped from Prometheus-compatible targets after applying all the configured relabeling stages.

Hence, when using streaming aggregation for pushed metrics, make sure the push protocol you use is one of the supported data ingestion protocols.