OPA Caching from Policy

Caching of OPA responses was added in M16 to handle cases where OPA was undersized in terms of memory requests and number of replicas (pods). Caching should not be needed if OPA is sized appropriately for your load; however, you will likely still get better performance with caching enabled.

When Search calls the Policy Service Translate API, Policy Service can make 1-2 calls to OPA per search request. Other Policy Service APIs also call OPA, and some services (such as Search) talk to OPA directly.

See sizing for more details on OPA Sizing recommendations.

How to disable Policy Service Cache

To disable caching, set the environment variable DISABLE_OPA_CACHE to True. Caching is also disabled if OPA_CACHE_MAXSIZE is set to 0.
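The two off-switches above might be checked like this; the exact parsing is an assumption for illustration, not the service's actual code:

```python
import os

def opa_cache_disabled() -> bool:
    # Caching is off when DISABLE_OPA_CACHE is set to True, or when
    # OPA_CACHE_MAXSIZE is 0 (string comparison assumed for illustration).
    if os.environ.get("DISABLE_OPA_CACHE", "").lower() == "true":
        return True
    return os.environ.get("OPA_CACHE_MAXSIZE") == "0"
```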

Adjusting Policy Service Cache Settings

Caching of OPA responses can be controlled by the following environment variables, shown with their defaults:

OPA_CACHE_MAXSIZE=128
OPA_CACHE_TTL=120
OPA_LIST_CACHE_MAXSIZE=1
OPA_LIST_CACHE_TTL=60
OPA_FETCH_CACHE_MAXSIZE (by default inherits value of OPA_CACHE_MAXSIZE)
OPA_FETCH_CACHE_TTL=60
OPA_COMPILE_CACHE_TTL (by default inherits value of OPA_CACHE_TTL)
OPA_DATAAPI_CACHE_MAXSIZE (by default inherits value of OPA_CACHE_MAXSIZE)
OPA_DATAAPI_CACHE_TTL (by default inherits value of OPA_CACHE_TTL)
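Several of these variables inherit their default from a parent setting. A sketch of how that fallback might be resolved; the helper is illustrative, not the service's actual code:

```python
import os

def cache_setting(name, parent=None, default=0):
    # An explicit value wins; otherwise fall back to the parent variable
    # (e.g. OPA_FETCH_CACHE_MAXSIZE inherits OPA_CACHE_MAXSIZE),
    # then to the hard-coded default.
    value = os.environ.get(name)
    if value is None and parent is not None:
        value = os.environ.get(parent)
    return int(value) if value is not None else default
```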

Info

Prior to M20, tweaking cache settings required building a container image with different settings in conf.py. Details on how your policy service is configured, including cache settings, can be retrieved via the /config API.

Policy List

Caches the list of available policies; only the latest result is cached.

OPA_LIST_CACHE_MAXSIZE=1
OPA_LIST_CACHE_TTL=60

Policies Cache

Caches the policies themselves, as fetched from OPA.

OPA_FETCH_CACHE_MAXSIZE=OPA_CACHE_MAXSIZE
OPA_FETCH_CACHE_TTL=60

Compile and Data API Cache

Caches compile and data API results from OPA (used by translate/search).

OPA_COMPILE_CACHE_MAXSIZE=OPA_CACHE_MAXSIZE
OPA_COMPILE_CACHE_TTL=OPA_CACHE_TTL
OPA_DATAAPI_CACHE_MAXSIZE=OPA_CACHE_MAXSIZE
OPA_DATAAPI_CACHE_TTL=OPA_CACHE_TTL

Cache Statistics

Enabling any of the variables below adds CacheInfo (hits, misses, maxsize, and currsize) for that cache to the /config API output. Note that collecting cache_info incurs a (probably minor) performance penalty, so it must be explicitly enabled in the policy service.

OPA_DATA_CACHE_INFO
OPA_COMPILE_CACHE_INFO 
OPA_DOCUMENT_CACHE_INFO 
OPA_FETCH_CACHE_INFO
OPA_LIST_CACHE_INFO
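The CacheInfo fields match what Python's functools-style caches report. A minimal illustration of how hits, misses, maxsize, and currsize accumulate (the function is a stand-in, not the service's code):

```python
from functools import lru_cache

@lru_cache(maxsize=128)
def fetch_policy(name):
    # Stand-in for an OPA fetch; only the statistics matter here.
    return f"policy:{name}"

fetch_policy("search")     # miss
fetch_policy("search")     # hit
fetch_policy("dataauthz")  # miss
info = fetch_policy.cache_info()
# info.hits == 1, info.misses == 2, info.maxsize == 128, info.currsize == 2
```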

Caching within Policies

To use caching with http.send in Rego policies, set cache or force_cache in the request object.

If you use force_cache, note that it overrides cache directives defined by the server, so you will also want to set force_cache_duration_seconds.

cache_ignored_headers

If an existing cache entry was stored with a subset or superset of the headers considered for the current request, the lookup will miss. Consider using cache_ignored_headers when using caching. Note that cache_ignored_headers was added in the v0.66 release of OPA.

OPA Cache Settings

Set OPA default cache settings

If you are using cache settings in policies, it is strongly recommended that you also consider these OPA configuration settings (set in OPA's config, not in Rego):

  • caching.inter_query_builtin_cache.max_size_bytes sets an inter-query cache size limit in bytes. OPA will drop old items from the cache if this limit is exceeded. By default, no limit is set (i.e. unlimited), which can cause OutOfMemory (OOMKilled) and MemoryPressure issues.
  • caching.inter_query_builtin_cache.forced_eviction_threshold_percentage - a threshold configured as a percentage of caching.inter_query_builtin_cache.max_size_bytes; when exceeded, OPA starts dropping old items prematurely. By default, set to 100.
  • caching.inter_query_builtin_cache.stale_entry_eviction_period_seconds - the stale entry eviction period in seconds. OPA drops expired items from the cache every stale_entry_eviction_period_seconds. By default, set to 0, indicating stale entry eviction is disabled.
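Put together, an OPA configuration fragment using these settings might look like the following; the specific byte, percentage, and period values are illustrative only:

```yaml
caching:
  inter_query_builtin_cache:
    max_size_bytes: 10485760               # 10 MiB cap; unset means unlimited
    forced_eviction_threshold_percentage: 80
    stale_entry_eviction_period_seconds: 10
```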

See sizing for additional details related to these settings.

Entitlements Caching

The Policy Service implements caching for user entitlements to improve performance of the /translate endpoint. The cache stores entitlement group data retrieved from the OSDU entitlements service, reducing API calls and response times for subsequent requests.

Architecture

The entitlements caching system uses a provider-based architecture that automatically selects the appropriate cache implementation based on the CLOUD_PROVIDER environment variable:

  • AWS: Uses ElastiCache/Valkey (Redis-compatible) with VM cache fallback
  • Azure/GC/Baremetal/IBM: Uses VM cache (in-memory caching with TTL)

Cache Key Generation

Cache keys are generated using the pattern entitlement-groups:{partition_id}:{authorization} and are hashed using CRC32C. This matches the Search service implementation (enabling shared cache hits across services) and ensures the cache is isolated by user and partition.
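A sketch of that key scheme. The Python standard library provides plain CRC32 (zlib.crc32) rather than CRC32C, so it stands in here for illustration only; the function name is hypothetical:

```python
import zlib

def entitlements_cache_key(partition_id: str, authorization: str) -> str:
    # Documented pattern: entitlement-groups:{partition_id}:{authorization},
    # then hashed to a compact key. The real service hashes with CRC32C;
    # zlib.crc32 (plain CRC32) is used here as a stand-in.
    raw = f"entitlement-groups:{partition_id}:{authorization}"
    return format(zlib.crc32(raw.encode("utf-8")), "08x")
```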

Configuration

Environment Variables

Variable                   Default    Description
CACHE_EXPIRATION_SECONDS   60         TTL for cached entries, in seconds
CACHE_MAXIMUM_SIZE         100        Maximum number of entries (VM cache only)
CLOUD_PROVIDER             Required   Determines the cache provider (aws, azure, gc, ibm, baremetal)

Implementation Details

Redis Implementation

  • Shared cache between Search and Policy pods
  • SSL/TLS encryption enabled
  • Automatic fallback to VM cache on connection failure
  • Namespace support for key isolation
  • SCAN-based key deletion for performance (currently unused)

VM Cache Implementation

Implemented using the Python cachetools library:

  • In-memory storage with automatic expiration
  • LRU eviction when maximum size is reached
  • Thread-safe operations
  • No persistence across service restarts
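Not the service's code, but a stdlib-only sketch of the behavior cachetools' TTLCache provides here (TTL expiry plus eviction of the oldest entry when the size limit is reached):

```python
import time

class TinyTTLCache:
    # Stdlib-only stand-in for cachetools.TTLCache, with the documented
    # defaults of 100 entries and a 60-second TTL.
    def __init__(self, maxsize=100, ttl=60):
        self.maxsize, self.ttl, self._data = maxsize, ttl, {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires = entry
        if time.monotonic() >= expires:
            del self._data[key]  # expired: drop and report a miss
            return None
        return value

    def set(self, key, value):
        if len(self._data) >= self.maxsize and key not in self._data:
            # Evict the oldest inserted entry (dicts preserve insertion order).
            self._data.pop(next(iter(self._data)))
        self._data[key] = (value, time.monotonic() + self.ttl)
```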

Provider Implementation Guide

To implement Redis caching for your provider:

  1. Copy the AWS Redis implementation: copy /app/cache/providers/aws/group_cache.py and /app/cache/providers/aws/cache_parameters.py to /app/cache/providers/{provider}/
  2. Configure cache parameters: edit /app/cache/providers/{provider}/cache_parameters.py to configure your instance's Redis credentials

Cache Behavior

Cache Hit Flow

  1. Request arrives at /translate endpoint
  2. Cache key generated from data-partition-id and Authorization headers
  3. Cache checked for existing entitlement data
  4. If found, cached data returned immediately

Cache Miss Flow

  1. Request made to entitlements service API
  2. Response cached with configured TTL
  3. Data returned to translate endpoint
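The two flows above, sketched as a single lookup function; the fetcher and cache objects are hypothetical placeholders for the entitlements client and the configured cache backend:

```python
def get_entitlement_groups(cache, fetch_groups, partition_id, authorization):
    # Hit flow: build the key from the two headers and return cached
    # data immediately when present.
    key = f"entitlement-groups:{partition_id}:{authorization}"
    groups = cache.get(key)
    if groups is not None:
        return groups
    # Miss flow: call the entitlements service, cache the response
    # (a plain dict stands in for the TTL cache), return the data.
    groups = fetch_groups(partition_id, authorization)
    cache[key] = groups
    return groups
```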

Error Handling

  • Cache connection failures automatically fall back to VM cache
  • Cache read/write errors are logged but don't block API requests
  • Service continues to function normally without caching if all cache backends fail
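The fallback behavior might look like this sketch (names are illustrative; the point is that cache errors are logged, never raised to the caller):

```python
import logging

log = logging.getLogger("policy.cache")

def cached_lookup(primary, fallback, key):
    # Try the shared Redis cache first; on connection trouble, fall back
    # to the in-process VM cache. A failure everywhere returns None, so
    # the request simply proceeds without caching.
    try:
        return primary.get(key)
    except ConnectionError as exc:
        log.warning("primary cache unavailable, using VM cache: %s", exc)
        try:
            return fallback.get(key)
        except Exception as exc:  # never let cache errors block the request
            log.warning("fallback cache failed: %s", exc)
            return None
```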

Performance Impact

Entitlements caching provides significant performance improvements:

  • Avoids entitlements service API calls for repeated requests
  • Decreases /translate endpoint response time
  • Improves overall system resilience by reducing external service dependencies (the implementation also adds exponential backoff retries on entitlements calls)