Lambda monitoring

Monitor & troubleshoot your Lambdas across all environments and regions by leveraging Helios' integration with AWS as well as distributed tracing with OpenTelemetry to gain complete observability over all three pillars: metrics, logs and traces.

The key to effective Lambda monitoring is combining the three pillars of observability - metrics, logs and traces - in a single place for the full context and details that enable fast root cause analysis.

Helios offers a wide range of deployment, observability & troubleshooting capabilities for AWS Lambda and other serverless functions.

  • Deployment manually, through a Lambda layer or even a serverless plugin
  • Observability & monitoring based on both the instrumented traces & metrics, as well as the raw logs from AWS
  • Troubleshooting all errors - applicative or not - made simpler based on E2E traces and context propagated properly

Different installation options for Lambda observability

Install the Helios SDK in your services by leveraging any of the supported deployment options:

  1. Using Helios' Lambda layers - the latest version is available under Settings > General in the Helios app (Recommended)
  2. Configuring Helios' serverless plugins
  3. Manually setting environment variables in Lambda configuration (Go | Node.js | Python)

Lambda observability & monitoring

AWS integration is required in order to get the complete Lambda observability & monitoring capabilities

Lambda overview

Each Lambda, represented as its own service in Helios, also provides a Lambda overview. It includes various trends & stats on metrics such as error trends, invocations, etc.

Lambda service overview with AWS integration and instrumentation

Lambda service overview, based both on the data instrumented by the Helios OpenTelemetry SDK as well as raw data from AWS

AWS Lambda status page

The Lambda status page provides a snapshot of what's going on with the Lambdas in use, across all regions. For each function, within the selected time frame, it displays the last invocation, number of invocations, number of function errors, number of out-of-memory errors (OOMs), number of timeouts and average duration, as well as the account ID, region and ARN. It is available under Cloud entities > Lambda.

Lambda status page based on instrumented data and AWS data

Lambda status page provides an aggregated overview of all the Lambdas instrumented by Helios

To enable quicker troubleshooting, with access to the full context of invocations and errors, the status page also links to the relevant traces in Helios so you can easily see E2E flows.
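To make the status page columns concrete, the following is a minimal sketch of how per-function figures like these could be aggregated from raw invocation records. It is an illustration only and not Helios' implementation: the `Invocation` record shape, the outcome labels and the function names are all assumptions made for this example. The only external fact relied on is the standard Lambda ARN layout (`arn:aws:lambda:<region>:<account-id>:function:<name>`).

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Iterable, Optional, Tuple


@dataclass
class Invocation:
    """One Lambda invocation record (hypothetical shape, for illustration only)."""
    function_arn: str
    started_at: datetime
    duration_ms: float
    outcome: str  # assumed labels: "ok", "error", "oom" or "timeout"


@dataclass
class FunctionStatus:
    """Per-function figures mirroring the status page columns."""
    invocations: int = 0
    errors: int = 0
    ooms: int = 0
    timeouts: int = 0
    total_duration_ms: float = 0.0
    last_invocation: Optional[datetime] = None

    @property
    def average_duration_ms(self) -> float:
        return self.total_duration_ms / self.invocations if self.invocations else 0.0


def summarize(records: Iterable[Invocation], start: datetime, end: datetime) -> Dict[str, FunctionStatus]:
    """Aggregate invocations per function ARN within the time frame [start, end)."""
    summary: Dict[str, FunctionStatus] = {}
    for rec in records:
        if not (start <= rec.started_at < end):
            continue  # outside the selected time frame
        status = summary.setdefault(rec.function_arn, FunctionStatus())
        status.invocations += 1
        status.total_duration_ms += rec.duration_ms
        if rec.outcome == "error":
            status.errors += 1
        elif rec.outcome == "oom":
            status.ooms += 1
        elif rec.outcome == "timeout":
            status.timeouts += 1
        if status.last_invocation is None or rec.started_at > status.last_invocation:
            status.last_invocation = rec.started_at
    return summary


def account_and_region(function_arn: str) -> Tuple[str, str]:
    """Read the account ID and region out of a Lambda function ARN."""
    parts = function_arn.split(":")
    return parts[4], parts[3]
```

Calling `summarize(records, start, end)` returns one `FunctionStatus` per ARN, and `account_and_region(arn)` supplies the remaining two columns without any extra lookups.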

E2E trace visualization

Helios' OpenTelemetry SDK gives you intuitive visibility into end-to-end flows that involve your Lambda functions, showing the data of each step together with the appropriate context.
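E2E traces depend on trace context being propagated from one service to the next. OpenTelemetry typically does this with the W3C Trace Context `traceparent` header (`<version>-<trace id>-<parent span id>-<flags>`), and instrumentation normally handles it automatically. The sketch below only illustrates the mechanism in plain Python; the function and class names are hypothetical and are not part of the Helios or OpenTelemetry APIs.

```python
import re
import secrets
from typing import Dict, NamedTuple, Optional

# W3C Trace Context, version 00: 2 hex version, 32 hex trace ID, 16 hex span ID, 2 hex flags.
_TRACEPARENT = re.compile(r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


class SpanContext(NamedTuple):
    trace_id: str
    span_id: str
    sampled: bool


def extract(headers: Dict[str, str]) -> Optional[SpanContext]:
    """Parse the incoming traceparent header; return None if it is missing or invalid."""
    value = next((v for k, v in headers.items() if k.lower() == "traceparent"), None)
    if value is None:
        return None
    match = _TRACEPARENT.match(value.strip())
    if match is None:
        return None
    version, trace_id, span_id, flags = match.groups()
    # Version ff and all-zero IDs are invalid according to the specification.
    if version == "ff" or trace_id == "0" * 32 or span_id == "0" * 16:
        return None
    return SpanContext(trace_id, span_id, bool(int(flags, 16) & 0x01))


def start_child(parent: Optional[SpanContext]) -> SpanContext:
    """Continue the caller's trace with a new span ID, or begin a new trace."""
    trace_id = parent.trace_id if parent else secrets.token_hex(16)
    sampled = parent.sampled if parent else True
    return SpanContext(trace_id, secrets.token_hex(8), sampled)


def inject(ctx: SpanContext) -> Dict[str, str]:
    """Build the header to attach to outgoing calls made by the function."""
    flags = "01" if ctx.sampled else "00"
    return {"traceparent": f"00-{ctx.trace_id}-{ctx.span_id}-{flags}"}
```

Because the child span keeps the parent's trace ID, every hop of the flow ends up in the same trace, which is what makes the end-to-end visualization possible.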

Lambda troubleshooting

Alerts

Users can customize and control exactly what data is important to them with each Lambda by leveraging the labels & alerts in Helios to save search queries and detect behaviors that are of interest. Customization can be done based on either applicative events, or Lambda metrics from AWS.

Setting a Lambda monitoring alert when an error occurs more than 3 times

Saving a label and configuring an alert for Lambda monitoring purposes
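As an illustration of the kind of rule shown above (alert when an error occurs more than 3 times), here is a minimal sketch of a threshold check over a sliding time window. Alerts in Helios are configured in the app; the class below, its name and its 5-minute default window are assumptions made purely to show the logic.

```python
from collections import deque
from typing import Deque


class ErrorCountAlert:
    """Fires when more than `threshold` errors fall within the last `window_seconds`.

    Hypothetical helper for illustration; timestamps are assumed to arrive in
    non-decreasing order (seconds since any fixed reference point).
    """

    def __init__(self, threshold: int = 3, window_seconds: float = 300.0) -> None:
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._timestamps: Deque[float] = deque()

    def record_error(self, timestamp: float) -> bool:
        """Record one error and return True if the alert should fire."""
        self._timestamps.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Drop errors that have aged out of the window.
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) > self.threshold
```

With `threshold=3`, the first three errors in a window do not trigger anything and the fourth does, matching the "more than 3 times" wording.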

Automatic access to CloudWatch logs

For each span, the relevant logs can be retrieved from CloudWatch with a single click.

Retrieving CloudWatch logs directly from each span and trace
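For context, fetching the logs of a single span by hand roughly means querying the function's log group for the span's time window. The sketch below builds such a query; it is not how Helios retrieves logs. It assumes the function uses the default log group name (`/aws/lambda/<function name>`) and that its log lines contain the AWS request ID. The function name `log_query_for_span` and the one-second padding are hypothetical choices.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict


def log_query_for_span(
    function_name: str,
    request_id: str,
    span_start: datetime,
    span_end: datetime,
    padding: timedelta = timedelta(seconds=1),
) -> Dict[str, Any]:
    """Build CloudWatch Logs query parameters covering one span (illustrative helper)."""

    def to_millis(moment: datetime) -> int:
        # CloudWatch Logs expects timestamps as milliseconds since the Unix epoch.
        return int(moment.timestamp() * 1000)

    return {
        # Assumption: the Lambda writes to its default log group.
        "logGroupName": f"/aws/lambda/{function_name}",
        "startTime": to_millis(span_start - padding),
        "endTime": to_millis(span_end + padding),
        # A quoted term in a filter pattern matches that exact phrase.
        "filterPattern": f'"{request_id}"',
    }
```

The resulting dictionary can be passed to the CloudWatch Logs `FilterLogEvents` API, for example `boto3.client("logs").filter_log_events(**query)`, provided boto3 and AWS credentials are available. Results are paginated, so a complete implementation would follow `nextToken`.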

Flow replay

As with other types of traces in Helios, each Lambda trace offers several quick troubleshooting actions. A key action is replaying a specific flow: Helios automatically generates the code, which you can then configure and use to reproduce an issue, run the flow in a different environment, investigate the root cause, and finally verify that the fix works properly.

Lambda auto-generated flow replay code based on OpenTelemetry instrumentation

Auto-generated flow replay code for Lambda traces based on the Helios OpenTelemetry instrumentation