Saturday, April 18, 2026

Production Visibility: Metrics Monitoring and Alerting


Pulling back the curtain

One thing that makes Rockset so magical is the fact that it “just works”. After years of carefully provisioning, managing, and tuning their data systems, customers feel that Rockset’s serverless offering is too good to be true (we’ve heard this exact phrase from many customers!). We pride ourselves on having abstracted away the Rube Goldberg-like complexities inherent in maintaining indexes and ETL pipelines. Providing visibility into this complexity is a necessary step to help our users get the most out of our product.

As our users increasingly rely on Rockset to power their applications, the same questions arise:

  • How do I know when I need to upgrade to a larger Virtual Instance?
  • Can I integrate Rockset into my monitoring and alerting toolkit?
  • How can I measure my data and query latencies?

To help answer these questions, we set out to show our users everything they could possibly want to see, while avoiding the overly complicated.

We wanted to provide a 360 degree overview both for users in the more exploratory, building phase and for those ready to take Rockset into production. With this in mind, we built two new features:

  1. Monitoring and metrics built directly into the Rockset Console
  2. The ability to integrate with existing third-party monitoring services


production-visibility-monitoring-diagram-1

Bringing our users real-time metrics on the health of their Rockset resources felt like the perfect opportunity for dogfooding, and so dogfood we did. Our new Metrics Dashboard is completely powered by Rockset!

Choosing metrics

At a high level, integrating Rockset into your application consists of:

  1. Choosing a Virtual Instance
  2. Ingesting data into collections
  3. Querying your data

The metrics we provide mirror this flow, and holistically cover any issues you may come across when building with Rockset. They fall into four categories:

  1. Virtual Instance

    1. CPU utilization (by leaf / aggregator)
    2. Allocated compute (by leaf / aggregator)
    3. Memory utilization (by leaf / aggregator)
    4. Allocated memory (by leaf / aggregator)
  2. Query

    1. Count (total count, and by Query Lambda)
    2. Latency (latency across all queries, and by Query Lambda)
    3. 4XX and 5XX errors (total count)
  3. Ingest

    1. Replication lag (by collection)
    2. Ingest errors (total across all collections)
    3. Streaming ingest (total across all collections)
    4. Bulk ingest (total across all collections)
  4. Storage

    1. Total storage size (total across all collections, and by collection)
    2. Total document count (total across all collections, and by collection)

In choosing what metrics to show, our baseline goal was to help users answer the most recurring questions, such as:

  • How can I measure my query latencies? What latencies have my Query Lambdas been seeing? Have there been any query errors? Why did my query latency spike?

    • Query latencies and errors across all queries are available, as well as latencies and errors per Query Lambda in Query Lambda details.
    • If your query latency spiked in the last 24 hours, you can drill down by looking at query latency by Query Lambda, if applicable.
  • What’s the data latency between my external source and my Rockset collection? Have there been any parse errors?

    • You can now view replication lag, as well as ingest parse errors per collection, in collection details.
  • How do I know when I need to upgrade to a larger Virtual Instance?

    • If you see spikes in CPU or memory utilization in your Virtual Instance, you should probably stagger your query load, or upgrade to a larger Virtual Instance. Check your query count / latency for the corresponding timestamp for further confirmation.
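The last check above (a utilization spike confirmed by query load at the same timestamp) can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name, the 0.8 utilization threshold, the "above the median query count" rule, and the input shapes are all hypothetical and are not part of Rockset's product or API.

```python
from statistics import median


def confirmed_spikes(utilization, query_counts, threshold=0.8):
    """Return timestamps where utilization is high AND query load is above typical.

    utilization:  list of (timestamp, fraction between 0 and 1) pairs
    query_counts: dict mapping timestamp -> number of queries in that interval
    threshold:    assumed cut-off for "high" utilization (hypothetical default)
    """
    if not query_counts:
        return []
    # Use the median query count as a rough notion of "typical" load.
    typical_load = median(query_counts.values())
    return [
        ts
        for ts, used in utilization
        if used >= threshold and query_counts.get(ts, 0) > typical_load
    ]


# Made-up sample data for illustration only.
cpu = [(1, 0.50), (2, 0.90), (3, 0.95)]
queries = {1: 10, 2: 50, 3: 10}
print(confirmed_spikes(cpu, queries))
```

A spike that coincides with elevated query load points toward staggering queries or a larger Virtual Instance; a spike without it suggests looking elsewhere, for example at ingest.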

Check out the new dashboard in the Metrics tab of the Rockset Console!

Integrating with existing monitoring services

Once you’ve done your due diligence and have decided that Rockset is the right fit for your application needs, you’ll likely want to integrate Rockset into your existing monitoring and alerting workflows.

Every organization has unique monitoring and alerting needs, and employs a vast array of third-party tools and frameworks. Instead of reinventing the wheel and trying to build our own framework, we wanted to build a mechanism that would enable users to integrate Rockset into any existing tool on the market.

Metrics Endpoint

We expose metrics data in a Prometheus-compatible scraping format, already an open standard, enabling you to integrate with many of the most popular monitoring services, including but not limited to:

  1. Prometheus (and Alertmanager)
  2. Grafana
  3. Graphite
  4. Datadog
  5. AppDynamics
  6. Dynatrace
  7. New Relic
  8. Amazon CloudWatch
  9. …and many more

$ curl https://api.rs2.usw2.rockset.com/v1/orgs/self/metrics -u {API key}:

# HELP rockset_collections Number of collections.
# TYPE rockset_collections gauge
rockset_collections{virtual_instance_id="30",workspace_name="commons",} 20.0 
rockset_collections{virtual_instance_id="30",workspace_name="myWorkspace",} 2.0 
rockset_collections{virtual_instance_id="30",workspace_name="myOtherWorkspace",} 1.0
# HELP rockset_collection_size_bytes Collection size in bytes.
# TYPE rockset_collection_size_bytes gauge 
rockset_collection_size_bytes{virtual_instance_id="30",workspace_name="commons",collection_name="_events",} 3.74311622E8 
...
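For readers who would rather consume this output from a script than from a full monitoring stack, here is a minimal Python sketch that mirrors the curl call above and parses the response. It assumes the endpoint accepts HTTP Basic auth with the API key as the username and an empty password (which is what `curl -u {API key}:` sends). The function names are hypothetical, and the parser is deliberately simple: it handles sample lines shaped like the ones shown here rather than every corner of the exposition format.

```python
import base64
import re
import urllib.request

# A sample line: metric name, optional {labels}, a value, optional timestamp.
SAMPLE_RE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$'
)
# A single label pair such as workspace_name="commons".
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def basic_auth_header(api_key):
    """Build the header that `curl -u {API key}:` would send."""
    token = base64.b64encode(f"{api_key}:".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def parse_metrics(text):
    """Parse exposition text into a list of (name, labels, value) tuples."""
    samples = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks and # HELP / # TYPE comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # Not a sample line (for example, a bare "...").
        name, label_text, value = match.groups()
        labels = dict(LABEL_RE.findall(label_text or ""))
        samples.append((name, labels, float(value)))
    return samples


def fetch_metrics(url, api_key):
    """Fetch and parse the metrics endpoint (hypothetical helper)."""
    request = urllib.request.Request(
        url, headers={"Authorization": basic_auth_header(api_key)}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_metrics(response.read().decode("utf-8"))
```

Calling `fetch_metrics` with the URL from the curl example and your own API key would return one tuple per sample line; label values are returned as strings and escape sequences inside them are left untouched.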

With this endpoint and the tools it integrates with, you can:

  • Programmatically monitor the state of your Rockset production metrics
  • Configure action items, like alerts based on CPU utilization thresholds
  • Set up auto remediation by changing Virtual Instance sizes based on production loads

This endpoint is disabled by default, and can be switched on in the Metrics tab (https://console.rockset.com/metrics) in the Rockset Console.
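As a rough illustration of the threshold-alert idea, the Python sketch below scans already-parsed samples and reports any that exceed a limit. In practice a tool such as Alertmanager would do this for you; the function name is hypothetical, and the metric name `example_cpu_utilization_percent` is invented for the example (consult the documentation for the names the endpoint actually exports).

```python
def find_breaches(samples, metric_name, threshold):
    """Return one alert message per sample of `metric_name` above `threshold`.

    samples: iterable of (name, labels, value) tuples, where labels is a dict.
    """
    alerts = []
    for name, labels, value in samples:
        if name != metric_name or value <= threshold:
            continue
        # Sort labels so the message is stable regardless of dict order.
        label_text = ", ".join(f"{k}={v}" for k, v in sorted(labels.items()))
        alerts.append(f"{name} [{label_text}] = {value} exceeds {threshold}")
    return alerts


# Hypothetical samples; the metric name is made up for illustration.
samples = [
    ("example_cpu_utilization_percent", {"role": "leaf"}, 91.0),
    ("example_cpu_utilization_percent", {"role": "aggregator"}, 42.0),
]
for message in find_breaches(samples, "example_cpu_utilization_percent", 80.0):
    print(message)
```

The same check could feed an auto-remediation step, though resizing a Virtual Instance programmatically is outside the scope of this sketch.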


production-visibility-monitoring-diagram-2

At Rockset, Grafana and Prometheus are two of our main monitoring pillars. For the rollout of our visibility efforts, we set up our internal Prometheus scraper to hit our own metrics endpoint. From there, we created charts and alerts which we use in tandem with existing metrics and alerts to monitor the new feature!

Get started

A detailed breakdown of the metrics we export through our new endpoint is available in our documentation.

We have a basic Prometheus configuration file and an Alertmanager rules template in our community GitHub repository to help you get started.


