Metrics
Etn-sc includes a variety of optional metrics that can be reported to the user. Metrics are disabled by default to spare the average user the computational overhead. Users who want more detailed metrics can enable them by passing the --metrics flag when starting Etn-sc. Some metrics are classed as especially expensive and are only enabled when the --metrics.expensive flag is supplied. For example, per-packet network traffic data is considered expensive.
The goal of the Etn-sc metrics system is that, as with logs, arbitrary metric collections can be added to any part of the code without requiring fancy constructs to analyze them (counter variables, public interfaces, crossing over the APIs, console hooks, etc.). Instead, metrics should be "updated" whenever and wherever needed, and then be automatically collected, surfaced through the APIs, and made queryable and visualizable for analysis.
Metric types
Etn-sc's metrics can be classified into four types: meters, timers, counters and gauges.
Meters
Analogous to physical meters (electricity, water, etc.), Etn-sc's meters measure the amount of "things" that pass through them and the rate at which they do so. A meter doesn't have a specific unit of measure (byte, block, malloc, etc.); it just counts arbitrary events. At any point in time a meter can report:
Total number of events that passed through the meter
Mean throughput rate of the meter since startup (events / second)
Weighted throughput rate in the last 1, 5 and 15 minutes (events / second) ("weighted" means that recent seconds count more than older ones)
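The three report types above can be illustrated with a small, self-contained sketch. This is not Etn-sc's implementation: the `Meter` type, the 5-second tick interval and the exponential decay formula are all assumptions chosen for illustration, and only the one-minute weighted rate is shown.

```go
package main

import (
	"fmt"
	"math"
)

// tickSeconds is how often the weighted rate is refreshed (an assumed value).
const tickSeconds = 5.0

// Meter is a toy, single-goroutine meter: a total count plus a
// one-minute exponentially weighted moving average of the event rate.
type Meter struct {
	count   int64   // total events seen
	pending int64   // events seen since the last Tick
	rate1   float64 // weighted rate in events per second
	primed  bool    // false until the first Tick has run
}

// Mark records n events.
func (m *Meter) Mark(n int64) {
	m.count += n
	m.pending += n
}

// Tick folds the last interval into the weighted rate. The decay factor
// makes an interval's influence fade with a one-minute time constant,
// so recent intervals weigh more than older ones.
func (m *Meter) Tick() {
	instant := float64(m.pending) / tickSeconds
	m.pending = 0
	if !m.primed {
		m.rate1, m.primed = instant, true
		return
	}
	alpha := 1 - math.Exp(-tickSeconds/60)
	m.rate1 += alpha * (instant - m.rate1)
}

// Count returns the total number of events.
func (m *Meter) Count() int64 { return m.count }

// MeanRate returns events per second over the given uptime.
func (m *Meter) MeanRate(uptimeSeconds float64) float64 {
	if uptimeSeconds <= 0 {
		return 0
	}
	return float64(m.count) / uptimeSeconds
}

// Rate1 returns the one-minute weighted rate.
func (m *Meter) Rate1() float64 { return m.rate1 }

func main() {
	m := &Meter{}
	m.Mark(50) // a burst of 50 events in the first interval
	m.Tick()
	m.Tick() // a quiet interval pulls the weighted rate down
	fmt.Printf("count=%d mean=%.2f/s rate1=%.2f/s\n", m.Count(), m.MeanRate(10), m.Rate1())
}
```

The sketch is not safe for concurrent use; a real meter would need synchronization and a background ticker.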
Timers
Timers are extensions of meters: the duration of each event is collected alongside a record of its occurrence. As with meters, a timer can measure arbitrary events, but each event must be assigned a duration individually. In addition to generating all of the meter report types, a timer also reports:
Percentiles (5, 20, 50, 80, 95), each reporting that the given percentage of events took less than the reported time to execute (e.g. Percentile 20 = 1.5s means that 20% of the measured events took less than 1.5 seconds to execute; consequently 80% (= 100% - 20%) took more than 1.5s)
Percentile 5: minimum durations (this is as fast as it gets)
Percentile 50: well behaved samples (boring, just to give an idea)
Percentile 80: general performance (these should be optimised)
Percentile 95: worst case outliers (rare, just handle gracefully)
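To make the percentile definition concrete, here is a minimal sketch that computes percentiles from a set of recorded durations using the nearest-rank method. The method, the `percentile` helper and the sample durations are assumptions for illustration; Etn-sc's own timers may sample and interpolate differently.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// percentile returns the nearest-rank p-th percentile (0 < p <= 100)
// of the samples, or 0 when there are none. The input is not modified.
func percentile(samples []float64, p float64) float64 {
	n := len(samples)
	if n == 0 {
		return 0
	}
	sorted := append([]float64(nil), samples...) // work on a copy
	sort.Float64s(sorted)
	rank := int(math.Ceil(p * float64(n) / 100)) // 1-based rank
	if rank < 1 {
		rank = 1
	}
	if rank > n {
		rank = n
	}
	return sorted[rank-1]
}

func main() {
	// Hypothetical event durations in seconds.
	durations := []float64{0.9, 0.1, 0.5, 0.3, 0.7, 0.2, 0.8, 0.4, 1.0, 0.6}
	for _, p := range []float64{5, 20, 50, 80, 95} {
		fmt.Printf("percentile %2.0f: %.1fs\n", p, percentile(durations, p))
	}
}
```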
Counters
A counter is a single int64 value that can be incremented and decremented. The current value of the counter can be queried.
Gauges
A gauge is a single int64 value. Its value can increment and decrement - as with a counter - but can also be set arbitrarily.
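The difference between the two types can be sketched with a pair of toy structs built on `sync/atomic`. The `Counter` and `Gauge` types and their method names are hypothetical and only illustrate the semantics described above; they are not Etn-sc's types.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Counter is a toy counter: an int64 that can only move by increments
// and decrements.
type Counter struct{ v int64 }

func (c *Counter) Inc(n int64)  { atomic.AddInt64(&c.v, n) }
func (c *Counter) Dec(n int64)  { atomic.AddInt64(&c.v, -n) }
func (c *Counter) Value() int64 { return atomic.LoadInt64(&c.v) }

// Gauge is a toy gauge: like a counter, but it can also be overwritten
// with an arbitrary value.
type Gauge struct{ v int64 }

func (g *Gauge) Inc(n int64)    { atomic.AddInt64(&g.v, n) }
func (g *Gauge) Dec(n int64)    { atomic.AddInt64(&g.v, -n) }
func (g *Gauge) Update(v int64) { atomic.StoreInt64(&g.v, v) }
func (g *Gauge) Value() int64   { return atomic.LoadInt64(&g.v) }

func main() {
	var c Counter
	c.Inc(5)
	c.Dec(2)

	var g Gauge
	g.Inc(5)
	g.Update(42) // discard the previous value entirely
	g.Dec(2)

	fmt.Println("counter:", c.Value(), "gauge:", g.Value())
}
```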
Querying metrics
Etn-sc collects metrics if the --metrics flag is provided at startup. Those metrics are available via an HTTP server if the --metrics.addr flag is also provided. By default the metrics are served at 127.0.0.1:6060/debug/metrics but a custom IP address can be provided. A custom port can also be set with the --metrics.port flag. More computationally expensive metrics are toggled on or off by providing or omitting the --metrics.expensive flag. For example, passing --metrics --metrics.addr 127.0.0.1 --metrics.port 6060 at startup serves metrics at the default address and port.
Navigating the browser to the given metrics address displays all the available metrics as JSON data.
Any developer is free to add, remove or modify the available metrics as they see fit. The precise list of available metrics is always available by opening the metrics server in the browser.
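The same list can be retrieved programmatically. The sketch below fetches the default metrics address and prints the metric names in sorted order. It assumes the response body is a single top-level JSON object keyed by metric name; `decodeMetrics` and `sortedNames` are hypothetical helpers, not part of Etn-sc.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"sort"
	"time"
)

// metricsURL is the default address given in the text.
const metricsURL = "http://127.0.0.1:6060/debug/metrics"

// decodeMetrics parses a JSON object into a name -> value map.
// Assumption: the server returns one top-level JSON object.
func decodeMetrics(body []byte) (map[string]interface{}, error) {
	var out map[string]interface{}
	if err := json.Unmarshal(body, &out); err != nil {
		return nil, err
	}
	return out, nil
}

// sortedNames returns the metric names in lexical order.
func sortedNames(m map[string]interface{}) []string {
	names := make([]string, 0, len(m))
	for name := range m {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Get(metricsURL)
	if err != nil {
		fmt.Println("metrics server not reachable:", err)
		return
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Println("reading response failed:", err)
		return
	}
	metrics, err := decodeMetrics(body)
	if err != nil {
		fmt.Println("response is not a JSON object:", err)
		return
	}
	for _, name := range sortedNames(metrics) {
		fmt.Printf("%s = %v\n", name, metrics[name])
	}
}
```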
Etn-sc also supports dumping metrics directly into an InfluxDB database. To activate this, the --metrics.influxdb flag must be provided at startup. The API endpoint, username, password and other InfluxDB tags can also be provided.
Etn-sc also provides Prometheus-formatted metrics data, which can be obtained from the http://127.0.0.1:6060/debug/metrics/prometheus URL.
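Data in the Prometheus text format is line-oriented: lines starting with `#` are comments or type hints, and sample lines carry a metric name followed by a value. A minimal sketch of reading such output is shown below. It only handles samples without labels, and the metric names in the sample input are hypothetical, not taken from Etn-sc.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseSamples extracts "name value" pairs from Prometheus-style text.
// Comment lines, blank lines and malformed lines are skipped.
// Assumption: samples carry no {label="..."} sets.
func parseSamples(text string) map[string]float64 {
	samples := make(map[string]float64)
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		fields := strings.Fields(line)
		if len(fields) < 2 {
			continue
		}
		value, err := strconv.ParseFloat(fields[1], 64)
		if err != nil {
			continue
		}
		samples[fields[0]] = value
	}
	return samples
}

func main() {
	// Hypothetical sample input; real metric names will differ.
	text := "# TYPE example_events_count gauge\nexample_events_count 12\nexample_events_rate 0.5\n"
	for name, value := range parseSamples(text) {
		fmt.Printf("%s = %g\n", name, value)
	}
}
```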
Creating and updating metrics
Metrics can be added easily in the Etn-sc source code.
To use the same meter from two different packages without creating dependency cycles, the metrics can be created using the NewOrRegisteredX() functions. These create a new meter if no meter with the given name exists, or return the existing meter otherwise.
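The create-or-return behaviour can be sketched with a name-keyed registry guarded by a mutex. Everything here is a toy stand-in: `newOrRegisteredMeter`, the `Meter` type and the metric names are hypothetical and do not reproduce Etn-sc's actual functions or signatures.

```go
package main

import (
	"fmt"
	"sync"
)

// Meter is a toy meter that only keeps a total count.
type Meter struct {
	mu    sync.Mutex
	count int64
}

// Mark records n events.
func (m *Meter) Mark(n int64) {
	m.mu.Lock()
	m.count += n
	m.mu.Unlock()
}

// Count returns the total number of events.
func (m *Meter) Count() int64 {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.count
}

var (
	registryMu sync.Mutex
	registry   = map[string]*Meter{} // shared, keyed by metric name
)

// newOrRegisteredMeter returns the meter registered under name,
// creating and registering it first if it does not exist yet.
func newOrRegisteredMeter(name string) *Meter {
	registryMu.Lock()
	defer registryMu.Unlock()
	if m, ok := registry[name]; ok {
		return m
	}
	m := &Meter{}
	registry[name] = m
	return m
}

func main() {
	// Two callers that only share the name end up with the same meter.
	fromPackageA := newOrRegisteredMeter("example/subsystem/events")
	fromPackageB := newOrRegisteredMeter("example/subsystem/events")
	fromPackageA.Mark(2)
	fromPackageB.Mark(3)
	fmt.Println(fromPackageA == fromPackageB, fromPackageB.Count())
}
```

Because callers agree only on a name, neither package needs to import the other to share the meter.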
The name given to the metric can be any arbitrary string. However, since Etn-sc assumes it to be some meaningful sub-system hierarchy, it should be named accordingly.
Metrics can then be updated wherever the measured event occurs in the code.
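As an illustration of updating a metric at the point where an event happens, the sketch below shows a toy timer that can be fed an explicit duration, a start time, or a function to measure. The `Timer` type and its methods are assumptions made for this example rather than Etn-sc's API, and the type is not safe for concurrent use.

```go
package main

import (
	"fmt"
	"time"
)

// Timer is a toy timer that tracks how many events were recorded,
// their total duration and the longest single duration.
type Timer struct {
	count int64
	total time.Duration
	max   time.Duration
}

// Update records one event that took d.
func (t *Timer) Update(d time.Duration) {
	t.count++
	t.total += d
	if d > t.max {
		t.max = d
	}
}

// UpdateSince records one event that started at start and ends now.
func (t *Timer) UpdateSince(start time.Time) { t.Update(time.Since(start)) }

// Time runs f and records how long it took.
func (t *Timer) Time(f func()) {
	start := time.Now()
	f()
	t.UpdateSince(start)
}

func (t *Timer) Count() int64         { return t.count }
func (t *Timer) Total() time.Duration { return t.total }
func (t *Timer) Max() time.Duration   { return t.max }

func main() {
	timer := &Timer{}

	timer.Update(150 * time.Millisecond) // duration known up front

	start := time.Now()
	time.Sleep(10 * time.Millisecond) // stand-in for real work
	timer.UpdateSince(start)

	timer.Time(func() { time.Sleep(5 * time.Millisecond) })

	fmt.Println("events:", timer.Count(), "total:", timer.Total(), "max:", timer.Max())
}
```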
Summary
Etn-sc can be configured to report metrics to an HTTP server or database. These functions are disabled by default but can be enabled by passing the appropriate flags on startup. Users can easily create custom metrics by adding them to the Etn-sc source code, following the instructions on this page.