OpenTelemetry: WARNING: Instrument has recorded multiple values for the same attributes

Context

Hello!

We are running Airbyte 0.39.42-alpha with Docker Compose and are configuring it to send metrics via OpenTelemetry, based on the documentation and related threads.

According to the documentation, we have updated the Docker Compose stack to:

  • set up the airbyte-metrics-reporter service for OpenTelemetry
  • set up the airbyte-worker service for OpenTelemetry
  • set up the opentelemetry-collector service to handle OTEL gRPC calls and expose metrics using the Prometheus exporter

Additionally, we have set up:

  • Prometheus to scrape data from opentelemetry-collector
  • Grafana to display Prometheus metrics

Issue

When the airbyte-metrics-reporter service emits metrics using the OpenTelemetry SDK, the following warnings appear in its logs:

airbyte-metrics-reporter  | Aug 08, 2022 3:21:29 PM io.opentelemetry.sdk.internal.ThrottlingLogger doLog
airbyte-metrics-reporter  | WARNING: Instrument oldest_running_job_age_secs has recorded multiple values for the same attributes.
airbyte-metrics-reporter  | Aug 08, 2022 3:21:29 PM io.opentelemetry.sdk.internal.ThrottlingLogger doLog
airbyte-metrics-reporter  | WARNING: Instrument num_running_jobs has recorded multiple values for the same attributes.
airbyte-metrics-reporter  | Aug 08, 2022 3:21:29 PM io.opentelemetry.sdk.internal.ThrottlingLogger doLog
airbyte-metrics-reporter  | WARNING: Instrument oldest_pending_job_age_secs has recorded multiple values for the same attributes.

When sync jobs are running, the gauges corresponding to the number of pending and running jobs do not seem to be updated accordingly, e.g. with two sync jobs running:

$ curl --silent http://localhost:8889/metrics | rg 'num_running'

# HELP airbyte_num_running_jobs number of running jobs
# TYPE airbyte_num_running_jobs gauge
airbyte_num_running_jobs{job="metrics-reporter"} 0
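To make this check repeatable while sync jobs are running, here is a minimal Python sketch that reads a gauge from the collector's Prometheus endpoint. The helper names (`parse_samples`, `fetch_metrics`) are hypothetical, and the parser is deliberately simplified: it assumes label values contain no `}` or escaped quotes, and it ignores the optional trailing timestamp.

```python
import re
import urllib.request

# One sample line: metric name, optional {labels}, then the value.
# Anything after the value (e.g. a timestamp) is ignored.
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)'
    r'(?:\{(?P<labels>[^}]*)\})?'
    r'\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text, metric_name):
    """Return a list of (labels, value) pairs for metric_name."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match and match.group("name") == metric_name:
            labels = dict(LABEL_RE.findall(match.group("labels") or ""))
            samples.append((labels, float(match.group("value"))))
    return samples


def fetch_metrics(url="http://localhost:8889/metrics"):
    """Download the text exposition from the Prometheus exporter port."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8")


# Sample copied from the curl output above.
EXAMPLE = """\
# HELP airbyte_num_running_jobs number of running jobs
# TYPE airbyte_num_running_jobs gauge
airbyte_num_running_jobs{job="metrics-reporter"} 0
"""
```

Calling `parse_samples(fetch_metrics(), "airbyte_num_running_jobs")` against the live stack should return one entry per label set. With two sync jobs running we would expect a value of 2 rather than 0.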

This issue seems to be limited to gauges, as counters are incremented correctly.

Configuration details

Please find the (curated) configuration related to OpenTelemetry that we used for the different services:


.env

VERSION=0.39.42-alpha
PUBLISH_METRICS="true"
METRIC_CLIENT=otel
OTEL_COLLECTOR_ENDPOINT="http://otel-collector:4317"

docker-compose.yml

services:
  worker:
    environment:
      - PUBLISH_METRICS=${PUBLISH_METRICS}
      - METRIC_CLIENT=${METRIC_CLIENT}
      - OTEL_COLLECTOR_ENDPOINT=${OTEL_COLLECTOR_ENDPOINT}

  airbyte-metrics-reporter:
    image: airbyte/metrics-reporter:${VERSION}
    logging: *default-logging
    container_name: airbyte-metrics-reporter
    environment:
      - CONFIG_DATABASE_PASSWORD=${CONFIG_DATABASE_PASSWORD:-}
      - CONFIG_DATABASE_URL=${CONFIG_DATABASE_URL:-}
      - CONFIG_DATABASE_USER=${CONFIG_DATABASE_USER:-}
      - CONFIGS_DATABASE_MINIMUM_FLYWAY_MIGRATION_VERSION=${CONFIGS_DATABASE_MINIMUM_FLYWAY_MIGRATION_VERSION:-}
      - CONFIG_ROOT=${CONFIG_ROOT}
      - DATABASE_PASSWORD=${DATABASE_PASSWORD}
      - DATABASE_URL=jdbc:postgresql://${DATABASE_HOST}:${DATABASE_PORT}/${DATABASE_DB}
      - DATABASE_USER=${DATABASE_USER}
      - PUBLISH_METRICS=${PUBLISH_METRICS}
      - METRIC_CLIENT=${METRIC_CLIENT}
      - OTEL_COLLECTOR_ENDPOINT=${OTEL_COLLECTOR_ENDPOINT}

  otel-collector:
    image: otel/opentelemetry-collector:0.57.2
    command: ["--config=/etc/otel-collector-config.yaml"]
    ports:
      - "8888:8888"   # Prometheus metrics exposed by the collector
      - "8889:8889"   # Prometheus exporter metrics
    volumes:
      - ./otel-collector/otel-collector-config.yaml:/etc/otel-collector-config.yaml

otel-collector/otel-collector-config.yaml

---
receivers:
  otlp:
    protocols:
      grpc: {}

processors:
  batch: {}

exporters:
  logging: {}
  prometheus:
    endpoint: 0.0.0.0:8889
    namespace: airbyte
    send_timestamps: true
    metric_expiration: 60m

extensions:
  health_check:
  pprof:
  zpages:

service:
  extensions: [health_check, pprof, zpages]
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [logging, prometheus]

Attempts

After seeing a related issue fixed in the OTEL SDK, I tried bumping the SDK version to 1.16 via Airbyte’s deps.toml and rebuilding the Docker image for airbyte-metrics-reporter:

$ git clone https://github.com/airbytehq/airbyte
$ cd airbyte
$ vim deps.toml    # set OTEL SDK version to 1.16.0
$ cd airbyte-metrics/reporter
$ ../../gradlew build

However, I observed the same behaviour: the warnings persisted and the gauges stayed stuck at 0.
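One pattern that would produce both symptoms at once is a client that registers a new asynchronous gauge callback every time a value is reported, instead of updating a single one. The toy model below illustrates that hypothesis only. `ToyAsyncGauge` and `report_leaky` are hypothetical names, this is not the OpenTelemetry API, and it assumes that during a collection the first value recorded for a given attribute set is kept while later ones are dropped with a warning. I have not confirmed that this is what Airbyte's metric client does.

```python
class ToyAsyncGauge:
    """Toy stand-in for an asynchronous gauge (NOT the OpenTelemetry API)."""

    def __init__(self, name):
        self.name = name
        self.callbacks = []
        self.warnings = []

    def register(self, callback):
        """Add a callback returning an (attributes, value) pair."""
        self.callbacks.append(callback)

    def collect(self):
        """Run every callback once and return {attributes: value}."""
        points = {}
        for callback in self.callbacks:
            attributes, value = callback()
            if attributes in points:
                # Assumed behaviour: the first value wins, later ones are dropped.
                self.warnings.append(
                    f"Instrument {self.name} has recorded multiple values "
                    "for the same attributes."
                )
                continue
            points[attributes] = value
        return points


def report_leaky(gauge, value):
    """Hypothesised anti-pattern: one new callback per reported value."""
    gauge.register(lambda: ((), value))


# Leaky variant: the stale 0 from the first callback hides the later 2.
leaky = ToyAsyncGauge("num_running_jobs")
report_leaky(leaky, 0)
report_leaky(leaky, 2)
leaky_points = leaky.collect()

# Alternative: a single callback that reads the latest stored value.
latest = {"value": 0}
fixed = ToyAsyncGauge("num_running_jobs")
fixed.register(lambda: ((), latest["value"]))
latest["value"] = 2
fixed_points = fixed.collect()
```

In this model the leaky gauge exports 0 and logs one warning, while the single-callback gauge exports 2 with no warning. That matches what we see, but it remains a guess until checked against the actual client code.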

The following discussion may provide better insights as to why the emission of the latest value fails for Airbyte gauges:

Please let me know if you need more information to reproduce the issue. I’ll also be happy to contribute fixes :slight_smile:

Thanks,

Aurélien

Hey @virtualtam, we really appreciate this post. I opened an issue on GitHub for your question. Please add your thoughts and follow the discussion over there!


Hi @sajarin, thanks for following up!

I’ll head over to GitHub to continue the discussion :+1:

For anyone facing similar behaviour with OpenTelemetry metrics collection, the corresponding issue is airbytehq/airbyte#15623 - OpenTelemetry: WARNING: Instrument has recorded multiple values for the same attributes