🎛 Data filtering

Control what and how much data is sent to Helios to ensure you see what's important without the noise, and also meet span consumption targets.

Controlling what data is collected and sent to Helios, and how much of it, is useful when you want to clear irrelevant data from cluttered views or stay within the quota of your current Helios plan (which is ultimately driven by a monthly span count).

You may take one of two approaches - or use a combination:

  1. Sample data sent from specific services
  2. Filter specific types or formats of data that are generally not interesting

📘

Learn more about how to analyze the span consumption breakdown in Helios.

Service sampling ratio

To fine-tune how much data Helios collects across your microservices application, the SDK lets you configure a sampling ratio per service, which determines the portion of that service's data that is traced.

By default, once initialized, Helios traces all data inside a service.

In some circumstances this can lead, for example, to one service quickly using up your monthly data quota, leaving data traced by other services out of scope.

Controlling the sampling ratio of each service is especially useful when different parts of your system handle different scales of data, or data of varying importance.

To configure the sampling ratio for a service, set the HS_SAMPLING_RATIO environment variable to a decimal value between 0 and 1, e.g. 0.1 or 0.0001.

See the detailed installation docs for specific settings per your relevant SDK language.
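As a rough illustration of choosing a ratio, the sketch below derives one from a span budget. The helper name ratio_for_budget and the span numbers are hypothetical, and it assumes the SDK reads HS_SAMPLING_RATIO when it is initialized; check the installation docs for your SDK language before relying on this.

```python
import os


def ratio_for_budget(expected_monthly_spans: int, monthly_span_budget: int) -> float:
    """Return a sampling ratio in [0, 1] that keeps a service within a span budget."""
    if expected_monthly_spans <= 0:
        return 1.0  # nothing to trim
    if monthly_span_budget <= 0:
        return 0.0  # no budget left for this service
    return min(1.0, monthly_span_budget / expected_monthly_spans)


# Hypothetical numbers: a service emitting ~50M spans a month with a 5M budget.
ratio = ratio_for_budget(50_000_000, 5_000_000)

# Fixed-point formatting avoids scientific notation for very small ratios.
# Assumption: the variable must be set before the Helios SDK is initialized.
os.environ["HS_SAMPLING_RATIO"] = f"{ratio:.6f}"
```

In most deployments the variable would be set in the service's environment (container spec, process manager) rather than in code; the in-process assignment here only keeps the example self-contained.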

🚧

Note that for each trace, the root span determines the sampling for the entire trace. For example, if serviceA calls serviceB, any sampling ratio applied to serviceB does not take effect for such traces; only the sampling ratio applied to serviceA does.
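The root-span rule can be pictured with a small simulation. This is only a sketch of the behavior described above, not the SDK's implementation; the function names and ratios are hypothetical.

```python
import random


def root_decision(sampling_ratio: float, rng: random.Random) -> bool:
    """Decide once, at the root span, whether the whole trace is kept."""
    return rng.random() < sampling_ratio


def child_decision(parent_sampled: bool, own_sampling_ratio: float) -> bool:
    """A non-root span inherits the root's decision; its own ratio is ignored."""
    return parent_sampled


rng = random.Random(7)  # seeded so repeated runs give the same counts
service_a_ratio = 0.5   # hypothetical ratio on the service that starts the trace
service_b_ratio = 0.01  # hypothetical ratio on the downstream service

kept_in_a = 0
kept_in_b = 0
for _ in range(10_000):
    sampled = root_decision(service_a_ratio, rng)
    kept_in_a += sampled
    kept_in_b += child_decision(sampled, service_b_ratio)

# serviceB keeps exactly the traces serviceA kept, despite its lower ratio.
print(kept_in_a, kept_in_b)
```

The practical consequence is that lowering the ratio on a downstream service only reduces spans for traces that start in that service.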

Seeing service sampling ratio in Helios

When you select a service, an API, or an outgoing operation, the sampling ratio is shown in the top-right widget if the data is sampled.

The entities-service has a sampling ratio of 0.000001

You can quickly retrieve the latest sampling ratio for each service (in every environment) by going to Settings > Plans > Current span usage and exporting the CSV report for the current period. One of the columns in the report contains the latest sampling ratio.

Exporting the CSV report of the current span usage in Helios

Filtering specific data

To further control what data is traced and reported to Helios, additional controls are available.

HTTP pattern filters

In many cases, applications interact with other applications over HTTP constantly while they are up and running.

Some examples are:

  • Exposing a health-check API call for the app, which is monitored closely.
  • Sending out similar calls to assess the availability of third-party applications.
  • Reporting metrics to external applications.
  • Many more ...

All of these interactions can add up quickly. They can eat into the quota you would rather spend on other, more important operations, or simply leave you unable to see the wood for the trees with all this data flying around.

Helios OpenTelemetry SDK

Depending on the specific SDK version and language, Helios allows you to define filter patterns that prevent such irrelevant data from being collected.

In general, you use environment variables to list the patterns for which tracing should be suppressed:

HS_IGNORED_OUTGOING_URLS=www.newrelic.com,www.google.com
HS_IGNORED_INCOMING_PATHS=/health
HS_EXCLUDED_URLS=/health,www.newrelic.com,www.google.com

As a result, HTTP requests (incoming or outgoing) to /health and outgoing requests to www.newrelic.com or www.google.com are not reported.

See the detailed instructions for enabling this under the Configuration section for your relevant version of the SDK.
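To make the effect of such patterns concrete, here is a minimal sketch of a filter over a single exclusion list. It assumes simple substring matching, which may not be how the SDK matches patterns, and the names parse_patterns and should_trace are hypothetical.

```python
# Same value as the HS_EXCLUDED_URLS example above.
EXCLUDED_URLS = "/health,www.newrelic.com,www.google.com"


def parse_patterns(raw: str) -> list[str]:
    """Split a comma-separated setting into non-empty, trimmed patterns."""
    return [p.strip() for p in raw.split(",") if p.strip()]


def should_trace(url: str, patterns: list[str]) -> bool:
    """Return False when the URL contains any excluded pattern (assumed substring match)."""
    return not any(pattern in url for pattern in patterns)


patterns = parse_patterns(EXCLUDED_URLS)

for url in ("/health", "https://www.google.com/search", "https://api.example.com/orders"):
    print(url, should_trace(url, patterns))
```

With substring matching, a path such as /healthy-recipes would also be excluded, so patterns are best kept as specific as possible.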

OpenTelemetry SDKs

If you are using one of the native OpenTelemetry SDKs to collect and report data to Helios, please refer to the official OpenTelemetry documentation for your service's language: