Cache
OPA Caching from Policy
Caching of OPA responses was added in M16 to handle cases where OPA was undersized in terms of memory requests and number of replicas (pods). Caching should not be needed if OPA is sized appropriately for your load; however, you will likely get better performance with caching enabled.
When Search calls the Policy Service Translate API, Policy Service can make one or two calls to OPA per search request. Other Policy Service APIs also call OPA, and some services (such as Search) talk to OPA directly.
See sizing for OPA sizing recommendations.
How to disable Policy Service Cache
To disable caching, set the environment variable DISABLE_OPA_CACHE to True. Caching is also disabled if OPA_CACHE_MAXSIZE is set to 0.
Adjusting Policy Service Cache Settings
Caching of OPA responses can be controlled by the following environment variables (shown with defaults):
OPA_CACHE_MAXSIZE=128
OPA_CACHE_TTL=120
OPA_LIST_CACHE_MAXSIZE=1
OPA_LIST_CACHE_TTL=60
OPA_FETCH_CACHE_MAXSIZE (by default inherits value of OPA_CACHE_MAXSIZE)
OPA_FETCH_CACHE_TTL=60
OPA_COMPILE_CACHE_TTL (by default inherits value of OPA_CACHE_TTL)
OPA_DATAAPI_CACHE_MAXSIZE (by default inherits value of OPA_CACHE_MAXSIZE)
OPA_DATAAPI_CACHE_TTL (by default inherits value of OPA_CACHE_TTL)
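For example, in a Kubernetes deployment these could be set on the Policy Service container. The values below are illustrative, not recommendations:

```yaml
# Illustrative container env settings for the Policy Service
env:
  - name: OPA_CACHE_MAXSIZE
    value: "256"          # raise from the default of 128
  - name: OPA_CACHE_TTL
    value: "120"
  - name: OPA_FETCH_CACHE_TTL
    value: "60"
  # To disable caching entirely:
  # - name: DISABLE_OPA_CACHE
  #   value: "True"
```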
Info
Prior to M20, tweaking cache settings required rebuilding the container image with different settings in conf.py. Details on how your Policy Service is configured, including cache settings, can be retrieved via the /config API.
Policy List
Caches the list of available policies; only the latest result is cached.
OPA_LIST_CACHE_MAXSIZE=1
OPA_LIST_CACHE_TTL=60
Policies Cache
Cache policies themselves from OPA
OPA_FETCH_CACHE_MAXSIZE=OPA_CACHE_MAXSIZE
OPA_FETCH_CACHE_TTL=60
Compile and Data API Cache
Cache compile and dataapi results from OPA (used by translate/search)
OPA_COMPILE_CACHE_MAXSIZE=OPA_CACHE_MAXSIZE
OPA_COMPILE_CACHE_TTL=OPA_CACHE_TTL
OPA_DATAAPI_CACHE_MAXSIZE=OPA_CACHE_MAXSIZE
OPA_DATAAPI_CACHE_TTL=OPA_CACHE_TTL
Cache Statistics
Enabling any of the below will add CacheInfo (hits, misses, maxsize, and currsize) to the /config API output. Note that cache_info incurs a (probably minor) performance penalty, so it must be explicitly enabled in the Policy Service.
OPA_DATA_CACHE_INFO
OPA_COMPILE_CACHE_INFO
OPA_DOCUMENT_CACHE_INFO
OPA_FETCH_CACHE_INFO
OPA_LIST_CACHE_INFO
Caching within Policies
To use caching in Rego policies, http.send supports the cache and force_cache parameters. If you use force_cache, note that it overrides the cache directives from the server's response headers, so you should also set force_cache_duration_seconds.
cache_ignored_headers
If a cache entry exists for a request whose considered headers are a subset or superset of the current request's headers, the lookup will be a cache miss. Consider using cache_ignored_headers when caching. Note that cache_ignored_headers was added in the v0.66 release of OPA.
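A minimal Rego sketch combining these options; the URL, header, and duration are illustrative:

```rego
package example

# Cache the response for 60s regardless of the server's cache headers
# (force_cache + force_cache_duration_seconds), and exclude a volatile
# tracing header from cache-key consideration so it cannot cause misses.
resp := http.send({
    "method": "GET",
    "url": "https://internal.example.com/api/data",
    "headers": {"X-Request-Id": input.request_id},
    "force_cache": true,
    "force_cache_duration_seconds": 60,
    "cache_ignored_headers": ["X-Request-Id"],
})
```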
OPA Cache Settings
Set OPA default cache settings
If you are using cache settings in policies, it is strongly recommended to also set these OPA configuration settings (in OPA's config, not in Rego):
- caching.inter_query_builtin_cache.max_size_bytes sets an inter-query cache size limit in bytes. OPA drops old items from the cache if this limit is exceeded. By default no limit is set (i.e. unlimited), which can cause Out of Memory (OOMKilled) and MemoryPressure issues.
- caching.inter_query_builtin_cache.forced_eviction_threshold_percentage is a threshold configured as a percentage of max_size_bytes; when exceeded, OPA starts dropping old items prematurely. By default, set to 100.
- caching.inter_query_builtin_cache.stale_entry_eviction_period_seconds is the stale entry eviction period in seconds; OPA drops expired items from the cache every stale_entry_eviction_period_seconds. By default, set to 0, meaning stale entry eviction is disabled.
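These map onto OPA's configuration file, for example (sizes and periods illustrative, not recommendations):

```yaml
# Fragment of an OPA config file (values illustrative)
caching:
  inter_query_builtin_cache:
    max_size_bytes: 10485760                  # 10 MiB cap; default is unlimited
    forced_eviction_threshold_percentage: 80  # start evicting early at 80% of the cap
    stale_entry_eviction_period_seconds: 10   # sweep expired entries every 10s
```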
See sizing for additional details related to these settings.
Entitlements Caching
The Policy Service implements caching for user entitlements to improve performance of the /translate endpoint. The cache stores entitlement group data retrieved from the OSDU entitlements service, reducing API calls and response times for subsequent requests.
Architecture
The entitlements caching system uses a provider-based architecture that automatically selects the appropriate cache implementation based on the CLOUD_PROVIDER environment variable:
- AWS: Uses ElastiCache/Valkey (Redis-compatible) with VM cache fallback
- Azure/GC/Baremetal/IBM: Uses VM cache (in-memory caching with TTL)
Cache Key Generation
Cache keys are generated using the pattern entitlement-groups:{partition_id}:{authorization} and hashed using CRC32C, consistent with the Search service implementation. This enables shared cache hits across services while keeping the cache isolated per user and partition.
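A sketch of key generation along these lines; the exact key layout in the service may differ, and the bitwise CRC-32C below is a plain stdlib version (the service may use an optimized library):

```python
def crc32c(data: bytes) -> int:
    """CRC-32C (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
    return crc ^ 0xFFFFFFFF


def entitlements_cache_key(partition_id: str, authorization: str) -> str:
    # Pattern from the docs: entitlement-groups:{partition_id}:{authorization},
    # hashed so the raw Authorization token never appears as a cache key.
    raw = f"entitlement-groups:{partition_id}:{authorization}"
    return f"{crc32c(raw.encode()):08x}"
```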
Configuration
Environment Variables
| Variable | Default | Description |
|---|---|---|
| CACHE_EXPIRATION_SECONDS | 60 | TTL for cached entries in seconds |
| CACHE_MAXIMUM_SIZE | 100 | Maximum number of entries (VM cache only) |
| CLOUD_PROVIDER | Required | Determines cache provider (aws, azure, gc, ibm, baremetal) |
Implementation Details
Redis Implementation
- Shared cache between Search and Policy pods
- SSL/TLS encryption enabled
- Automatic fallback to VM cache on connection failure
- Namespace support for key isolation
- SCAN-based key deletion for performance (currently unused)
VM Cache Implementation
Implemented using the Python cachetools library:
- In-memory storage with automatic expiration
- LRU eviction when maximum size is reached
- Thread-safe operations
- No persistence across service restarts
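The behavior above can be sketched with a minimal stdlib-only TTL + LRU cache; this is an illustration of the semantics, not the service's actual implementation (which uses cachetools):

```python
import time
from collections import OrderedDict
from threading import Lock


class SimpleTTLCache:
    """Minimal sketch of an in-memory cache with TTL expiry and LRU eviction."""

    def __init__(self, maxsize=100, ttl=60.0):
        self.maxsize, self.ttl = maxsize, ttl
        self._data = OrderedDict()  # key -> (expiry, value), LRU order
        self._lock = Lock()         # thread-safe operations

    def get(self, key, default=None):
        with self._lock:
            item = self._data.get(key)
            if item is None:
                return default
            expiry, value = item
            if time.monotonic() >= expiry:  # automatic expiration
                del self._data[key]
                return default
            self._data.move_to_end(key)     # mark as recently used
            return value

    def set(self, key, value):
        with self._lock:
            if key in self._data:
                self._data.move_to_end(key)
            self._data[key] = (time.monotonic() + self.ttl, value)
            while len(self._data) > self.maxsize:
                self._data.popitem(last=False)  # evict least recently used
```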
Provider Implementation Guide
To implement Redis caching for your provider:
- Copy the AWS Redis implementation: copy /app/cache/providers/aws/group_cache.py and /app/cache/providers/aws/cache_parameters.py to /app/cache/providers/{provider}/
- Configure cache parameters: edit /app/cache/providers/{provider}/cache_parameters.py to configure your instance's Redis credentials
Cache Behavior
Cache Hit Flow
- Request arrives at the /translate endpoint
- Cache key generated from the data-partition-id and Authorization headers
- Cache checked for existing entitlement data
- If found, cached data returned immediately
Cache Miss Flow
- Request made to entitlements service API
- Response cached with configured TTL
- Data returned to translate endpoint
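The hit/miss flow above amounts to a get-or-fetch pattern, sketched here with a hypothetical helper (`fetch` stands in for the call to the OSDU entitlements service; any cache object with get/set works):

```python
def get_entitlement_groups(cache, partition_id, authorization, fetch):
    """Illustrative get-or-fetch helper for the hit/miss flow."""
    key = f"entitlement-groups:{partition_id}:{authorization}"
    groups = cache.get(key)
    if groups is not None:                       # cache hit: return immediately
        return groups
    groups = fetch(partition_id, authorization)  # cache miss: call entitlements
    cache.set(key, groups)                       # cached with the configured TTL
    return groups
```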
Error Handling
- Cache connection failures automatically fall back to VM cache
- Cache read/write errors are logged but don't block API requests
- Service continues to function normally without caching if all cache backends fail
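The fallback behavior can be sketched as follows; the names are illustrative, not the service's actual API:

```python
import logging

log = logging.getLogger("policy.cache")


def cached_get_with_fallback(primary, fallback, key):
    """Try the primary (Redis-backed) cache first; fall back to the VM cache
    on failure. Cache errors are logged but never propagate to the request,
    so a total cache failure simply behaves like a miss."""
    try:
        return primary.get(key)
    except Exception as exc:  # e.g. Redis connection failure
        log.warning("primary cache unavailable, using VM cache: %s", exc)
        try:
            return fallback.get(key)
        except Exception:
            return None       # all cache backends failed: treat as a miss
```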
Performance Impact
Entitlements caching provides significant performance improvements:
- Avoids entitlements service API calls for repeated requests
- Decreases /translate endpoint response time
- Improves overall system resilience by reducing external service dependencies (combined with exponential backoff retries)