Observability and OpenTelemetry Configuration

The Policy Service includes comprehensive observability features using OpenTelemetry for distributed tracing and metrics collection.

OpenTelemetry Integration

The service uses manual OpenTelemetry instrumentation to provide detailed tracing of:

  • HTTP requests (FastAPI endpoints)
  • External service calls (Entitlements, OPA Agent)
  • Redis cache operations
  • Cloud storage operations (policy bundle downloads)

Configuration

OpenTelemetry is configured via environment variables:

Variable                            Description              Example
OTEL_SERVICE_NAME                   Service name for traces  os-policy
OTEL_EXPORTER_OTLP_TRACES_ENDPOINT  Traces endpoint          http://otel-collector:4318/v1/traces
OTEL_EXPORTER_OTLP_PROTOCOL         Export protocol          http/protobuf
OTEL_PROPAGATORS                    Trace propagators        tracecontext,baggage,b3
OTEL_RESOURCE_ATTRIBUTES            Resource attributes      service.name=os-policy,deployment.environment=kubernetes
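A quick way to catch a misconfigured deployment is to check these variables at startup. The helper below is a sketch that uses only the standard library; `check_otel_env` and its choice of which variables count as required are assumptions made for illustration, not part of the service.

```python
import os
from urllib.parse import urlparse

# Treating these two as required is an assumption of this sketch.
REQUIRED = ("OTEL_SERVICE_NAME", "OTEL_EXPORTER_OTLP_TRACES_ENDPOINT")
# Protocol values defined by the OTLP exporter specification.
KNOWN_PROTOCOLS = {"grpc", "http/protobuf", "http/json"}


def check_otel_env(env=None) -> list[str]:
    """Return a list of problems; an empty list means none were found."""
    env = os.environ if env is None else env
    problems = [f"{name} is not set" for name in REQUIRED if not env.get(name)]

    endpoint = env.get("OTEL_EXPORTER_OTLP_TRACES_ENDPOINT", "")
    if endpoint:
        parsed = urlparse(endpoint)
        if parsed.scheme not in ("http", "https") or not parsed.hostname:
            problems.append(f"endpoint {endpoint!r} is not an http(s) URL")

    protocol = env.get("OTEL_EXPORTER_OTLP_PROTOCOL")
    if protocol and protocol not in KNOWN_PROTOCOLS:
        problems.append(f"unknown protocol {protocol!r}")
    return problems
```

Calling `check_otel_env()` with no argument inspects the current process environment; passing a dict makes the check easy to unit test.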

Automatic Instrumentation

The service automatically instruments:

  • FastAPI: All HTTP endpoints and middleware
  • Requests: HTTP calls to external services (Entitlements, OPA)
  • Redis: Cache operations (get, set, delete)
  • URLLib3: Low-level HTTP operations

Manual Tracing Setup

Tracing is initialized in app/tracing.py.

Monitoring

Observability Platform Integration

When deployed on Kubernetes, traces can be sent to various observability platforms via OpenTelemetry collectors, providing:

  • Service maps showing request flows
  • Performance insights and bottlenecks
  • Error tracking and alerting
  • Distributed tracing across microservices

Troubleshooting

Common Issues

  1. Missing traces: Ensure that OTEL_EXPORTER_OTLP_TRACES_ENDPOINT is set and points to a reachable collector
  2. Service not identified properly: Verify that OTEL_RESOURCE_ATTRIBUTES includes the appropriate environment tags
  3. Cache operations not traced: Check that Redis instrumentation is enabled in app/tracing.py
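The first two checks can be scripted. The sketch below uses only the standard library; both helper names are hypothetical, and a successful TCP connection only shows that something is listening at the endpoint, not that it accepts OTLP data.

```python
import socket
from urllib.parse import urlparse


def parse_resource_attributes(raw: str) -> dict[str, str]:
    """Parse the comma-separated key=value format of OTEL_RESOURCE_ATTRIBUTES."""
    attrs = {}
    for item in raw.split(","):
        key, sep, value = item.partition("=")
        if sep and key.strip():
            attrs[key.strip()] = value.strip()
    return attrs


def endpoint_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the endpoint's host and port succeeds."""
    parsed = urlparse(url)
    if not parsed.hostname:
        return False
    try:
        # Fall back to the scheme's default port when none is given.
        port = parsed.port or (443 if parsed.scheme == "https" else 80)
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except (OSError, ValueError):
        return False
```

Running `endpoint_reachable` from inside the service's pod, with the value of OTEL_EXPORTER_OTLP_TRACES_ENDPOINT, helps separate network problems from exporter misconfiguration.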

OPA Agent Distributed Tracing

The OPA Agent can be configured for distributed tracing using its native OpenTelemetry support. For complete configuration options, see the OPA distributed tracing documentation.

OPA Configuration

Add the following to the OPA configuration file:

distributed_tracing:
  type: grpc                    # or "http"
  address: otel-collector:4317  # gRPC endpoint (4318 for HTTP)
  service_name: opa-agent
  sample_percentage: 100        # Percentage of traces to sample
  resource:
    service_namespace: "policy-system"
    deployment_environment: "kubernetes"

OPA Troubleshooting

  1. No OPA traces: Verify that the distributed_tracing section is present in the OPA config
  2. Connection errors: Check that the collector endpoint is reachable from the OPA Agent
  3. Missing spans: Ensure that the OPA version supports distributed tracing (v0.40.0+)
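Two of these checks lend themselves to a small script. The sketch below assumes the OPA configuration has already been loaded into a dictionary (for example with a YAML parser) and that the OPA version string is known; `check_opa_tracing` and `parse_version` are hypothetical helpers, and the default minimum version is simply the one stated above.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn 'v0.61.0' or '0.61.0-suffix' into a comparable tuple of ints."""
    core = text.lstrip("v").split("-", 1)[0]
    return tuple(int(part) for part in core.split("."))


def check_opa_tracing(
    config: dict, opa_version: str, minimum: str = "0.40.0"
) -> list[str]:
    """Return a list of problems with the OPA tracing setup; empty if none found."""
    problems = []
    section = config.get("distributed_tracing")
    if not section:
        problems.append("distributed_tracing section is missing")
    else:
        if section.get("type") not in ("grpc", "http"):
            problems.append("distributed_tracing.type must be 'grpc' or 'http'")
        if not section.get("address"):
            problems.append("distributed_tracing.address is not set")
    if parse_version(opa_version) < parse_version(minimum):
        version = opa_version.lstrip("v")
        problems.append(f"OPA {version} is older than {minimum}")
    return problems
```

An empty result does not prove that spans reach the collector; it only rules out the configuration mistakes listed above.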