Announcing Servo

admin
الجمعة، 10 فبراير 2012

By Brian Harrington & Greg Orzell

In a previous blog post about auto scaling, I mentioned that we would be open sourcing the library that we use to expose application metrics. Servo is that library. It is designed to make it easy for developers to export metrics from their application code, register them with JMX, and publish them to external monitoring systems such as Amazon's CloudWatch. This is especially important at Netflix because we are a data driven company and it is essential that we know what is going on inside our applications in near real time. As we increased our use of auto scaling based on application load, it became important for us to be able to publish custom metrics to CloudWatch so that we could configure auto-scaling policies based on the metrics that most accurately capture the load for a given application. We already had the servo framework in place to publish data to our internal monitoring system, so it was extended to allow for exporting a subset (AWS charges on a per metric basis) of the data into CloudWatch.

Features

Simple: It is trivial to expose and publish metrics without having to write lots of code such as MBean interfaces.
JMX Registration: JMX is the standard monitoring interface for Java and can be queried by many existing tools. Servo makes it easy to expose metrics to JMX so they can be viewed from a wide variety of Java tools such as VisualVM.
Flexible publishing: Once metrics are exposed, it should be easy to regularly poll the metrics and make them available for internal reporting systems, logs, and services like Amazon's CloudWatch. There is also support for filtering to reduce cost for systems that charge per metric, and asynchronous publishing to help isolate the collection from downstream systems that can have unpredictable latency.

The rest of this post provides a quick preview of Servo, for a more detailed overview see the Servo wiki.

Registering Metrics

Registering metrics is designed to be both easy and flexible. Using annotations you can call out the fields or methods that should be monitored for a class and specify both static and dynamic metadata. The example below shows a basic server class with some stats about the number of connections and amount of data that as been seen.

See the annotations wiki page for a more detailed summary of the available annotations and the options that are available. Once you have annotated your class, you will need to register each new object instance with the registry in order for the metrics to get exposed. A default registry is provided that exports metrics to JMX.

Now that the instance is registered metrics should be visible in tools like VisualVM when you run your application.

Publishing Metrics

After getting into JMX, the next step is to collect the data and make it available to other systems. The servo library provides three main interfaces for collecting and publishing data:

MetricObserver: an observer is a class that accepts updates to the metric values. Implementations are provided for keeping samples in memory, exporting to files, and exporting to CloudWatch.
MetricPoller: a poller provides a way to collect metrics from a given source. Implementations are provided for querying metrics associated with a monitor registry and arbitrary metrics exposed to JMX.
MetricFilter: filters are used to restrict the set of metrics that are polled. The filter is passed in to the poll method call so that metrics that can be expensive to collect, will be ignored as soon as possible. Implementations are provided for filtering based on a regular expression and prefixes such as package names.

The example below shows how to configure the collection of metrics each minute to store them on the local file system.

By simply using a different observer, we can instead export the metrics to a monitoring system like CloudWatch.

You have to provide your AWS credentials and namespace at initialization. Servo also provides some helpers for tagging the metrics with common dimensions such as the auto scaling group and instance id. CloudWatch data can be retrieved using the standard Amazon tools and APIs.