Metrics

The Genesis Platform provides metrics both on the framework and at the application level.

Framework metrics are provided out of the box. Once these have been enabled, they provide base-line information. You can build on top of this to provide application-specific metrics.

Metrics allow in-depth monitoring of running applications and early detection of issues.

tip

It is recommended that you always enable metrics in production applications. This helps early detection of problems and gives you invaluable context if an incident occurs.

Enabling metrics

To enable metrics on your application, add the following settings to your system definition file:

item(name = "MetricsEnabled", value = "true")
item(name = "MetricsReportType", value = "{see below for details}")
item(name = "MetricsClassLoaderStatsEnabled", value = "true")
item(name = "MetricsProcessorStatsEnabled", value = "true")

Property	Default	Explanation
MetricsEnabled	`false`	enables metrics for your application
MetricsReportType	n/a	a comma-separated list of metrics reporters
MetricsProcessorStatsEnabled	`false`	enables additional process metrics
MetricsClassLoaderStatsEnabled	`false`	enables additional classloader metrics

Metric names

The framework provides an abstraction on top of the reporters. A single interface is used, regardless of how the metrics are reported.

Each metric has an identifier that consists of a name and a series of tags.

The framework provides these tags for every metric:

Metric	Explanation
groupName	process group, e.g. GENESIS or AUTH
processName	the name of the process e.g. GENESIS_ROUTER
hostname	the hostname e.g. TAM_PROD1

In addition to these tags, each metric can provide additional tags. How these tags are reported depends on the metric reporter.

Metric reporters

You can specify multiple reporters. Your list must be comma-separated, e.g. GRAPHITE,DATADOG. The reporter affects the name of metrics. Within the framework, metrics have a name and tags.

The available reporters are:

Reporter	Description
GRAPHITE	Publishes metrics to a Graphite server
DATADOG	Publishes metrics to a Datadog server
SLF4J	Publishes metrics to the services' log file

Graphite

Graphite is an open-source sink that you can use for metrics in Genesis applications.

Graphite is often paired with Grafana, which can source data from Graphite and provides the means to set up and display alerts.

Graphite reports metrics in a hierarchical - tree - structure.

There are two ways for Genesis to generate the metric id for graphite: either hierarchical or dimensional.

Hierarchical metric structure

When the hierarchical structure is used, the metric id will be built of the following components separated by dots ..

genesis
the value of the group name tag
the value of the process name tag
the value of the host name tag
additional tags, including the key and value
the metric name

e.g. genesis.genesis.genesis_router.tam_prod1.active_connections

Dimensional metric structure

When the dimensional metric structure is used, the metric id will begin with the metric name and all tag keys and values are joined.

e.g. active_connections{hostName=tam_prod1.processName=genesis_router.groupName=genesis}.

Settings

Genesis supports Pickle protocol, defaulting to port 2004. If your Graphite instance exposes a different port for the Pickle protocol, update your system definition as shown below:

item(name = "MetricsGraphiteURL", value = "localhost")
item(name = "MetricsGraphitePort", value = "2004")
item(name = "MetricsStructureType", value = "hierarchical")

info

Currently, the Genesis Application Platform does not support UDP and PLAINTEXT.

For the MetricsStructureType, Genesis metrics supports both hierarchical and dimensional.

Datadog

Datadog is a proprietary cloud-based metrics sink that you can use with Genesis.

Datadog supports tags, and these are provided, along with the metric name.

To use Datadog, make the following three settings to your application's system definition file:

item(name = "MetricsDatadogApiKey", value = "YOUR_API_KEY")
item(name = "MetricsDatadogApplicationKey", value = "YOUR_APP_KEY")
item(name = "MetricsDatadogUri", value = "https://api.datadoghq.com")

SLF4J

Metrics can be reported straight to the log file.

This is an easy way to test metrics during development and testing; however, it is not recommended in a production environment.

There are three optional properties for the logger reporter:

item(name = "MetricsReportIntervalSecs", value = "60")
item(name = "Slf4jReporterLoggingLevel", value = "DEBUG")
item(name = "Slf4jReporterLogInactive", value = "true")

Property	Default	Explanation
MetricsReportIntervalSecs	`60`	interval between metrics log messages
Slf4jReporterLoggingLevel	`DEBUG`	the log level to use
Slf4jReporterLogInactive	`true`	when set to `true`, metrics with a zero `0` will not be reported

Adding custom metrics

The MetricService class can be injected into any application code. This class is the entry point for registering and tracking metrics.

There are no additional dependencies to add for using the metric service, for example:

Kotlin
Java

class MetricsSample @Inject constructor(
    private val metricService: MetricService,
) {
    // more class here...

}

public class MetricsSample {
    private final MetricService metricService;
    @Inject public MetricsSample(MetricService metricService) {
        this.metricService = metricService;
    }
}

Metrics names and tags

When registering metrics, you should provide a metric name and, optionally, you can provide tags.

The way that the metric is then displayed depends on the reporter that you select.

In addition to the provided tags, the Genesis Platform automatically adds the following tags to all metrics:

process name
host name
process group name

It is recommended that you make your metric names clear and distinct. for example, processing_latency rather than just latency.

In addition to a name, tags can also be provided.

Each tag is a key-value-pair that is reported alongside the metric name.

The syntax for registering all metric types is the same, so we will use counter as an example name here.

Tags can be provided as a vararg or as a map.

Kotlin
Java

val counter = metricService.counter(
    "update_queue.message_rate",
    "topic" to topic,
)

var counter = metricService.counter(
    "message_rate",
    new Pair("topic", topic)
);

Types of metric

The Genesis Platform provides support for three types of metric.

Type	Explanation	Example
counter	Tracks a rate, monotonically increasing	Messages received
timer	Measures timing	Message latency
gauge	Tracks an numerical value	Number of messages queued
gaugeCounter	Gauge with a counter-like interface	Number of messages queued

Gauge or counter?

Use a counter when the rate is interesting. Use a gauge when the number is interesting.

Let us consider an example. When you track the number of messages received, you are usually interested in the number of messages a system handles over an interval, not the total number of messages received since the last restart.

An increase in the rate indicates a higher load on the system.

Conversely, with gauges, the number itself is meaningful; for example, the number of connections available in the connection pool. You don't want to measure how often connections are requested and released in the application. However, you do want to track that connections are always available during application runtime.

Note that a timer also counts the occurrences, so you never need to use a counter and a timer in the same place.

Counter

A counter is used to track an ever-increasing number of specific events. You can think of this as providing a rate over time.

It is generally more interesting to see how many messages have been processed in the last 5 minutes, than it is to see how many messages have been processed since a service was last started.

Example of counters:

update queue counter - how many database updates on a table are published every interval
message counter - how many messages does a service receive at every interval

Usage

Once a counter has been declared, to use it, just use the increment() function. In this example we will assume that a counter with name myCounter has been declared.

Kotlin
Java

    myCounter.increment()

    myCounter.increment();

Timer

Timers are used for two things:

to time specific events
to count the number of events

So you never need to use both a timer and a counter to track the same event.

Examples of timers are:

message latency - how long a service takes to process a specific message
query latency - how long a database operation takes

Usage

To use a timer: start the sample first, and then register it on completion.

The example below declares a timer with the name myTimer.

Kotlin
Java

val sample = Timer.start()
doLotsOfWork()
sample.stop(myTimer)

var sample = Timer.start();
doLotsOfWork();
sample.stop(myTimer);

Gauge

Gauges are useful when you are not interested in a rate or a latency, but you need to measure an absolute value that can go either up or down.

For example:

queued messages - how many messages are waiting to be processed
memory usage - how much memory is available to a service

There is a clear distinction between counters and gauges:

Counters are concerned with the rate of events, e.g. the number of messages in an interval
With a gauge we are concerned with the absolute value, e.g. what is the number of outstanding messages

Usage

Gauges require a more verbose syntax. You need to register a class, along with a way to extract a value:

Kotlin
Java

// register the gauge when initialising the class
val myTags = HashMap<String, String>()
myTags["TAG"] = "VALUE"
val myGauge = metricService.gauge(
    "message_rate",
    myTags,
    myClass,
    MyClass::value
)

// update the value when required
myClass.value = 12.0

// register the gauge when initialising the class
var myTags = new HashMap<String, String>();
myTags.put("TAG", "VALUE");
var myGauge = metricService.gauge(
    "message_rate",
    myTags,
    myClass,
    MyClass::getRecord
);

// update the value when required
myClass.setValue(12.0);

GaugeCounter

A gauge counter is a gauge that is set up to behave like a counter. The benefit is that the counter can increase as well as decrease.

Usage

Kotlin
Java

// register the gauge when initialising the class
myGaugeCounter = metricService.gaugeCounter(
    "message_rate",
    "topic" to "topic",
)

// update the value when required
myGaugeCounter.increment()
// or
myGaugeCounter.decrement()

// register the gauge when initialising the class
var myGaugeCounter = metricService.gaugeCounter(
    "message_rate",
    new Pair<>("topic", "topic")
);

// update the value when required
myGaugeCounter.increment();
// or
myGaugeCounter.decrement();

Enabling metrics​

Metric names​

Metric reporters​

Graphite​

Hierarchical metric structure​

Dimensional metric structure​

Settings​

Datadog​

SLF4J​

Adding custom metrics​

Metrics names and tags​

Types of metric​

Counter​

Usage​

Timer​

Usage​

Gauge​

Usage​

GaugeCounter​

Usage​

Enabling metrics

Metric names

Metric reporters

Graphite

Hierarchical metric structure

Dimensional metric structure

Settings

Datadog

SLF4J

Adding custom metrics

Metrics names and tags

Types of metric

Counter

Usage

Timer

Usage

Gauge

Usage

GaugeCounter

Usage