[Proposal] Python Policy Support: Enable ML/AI-Powered Policies via Python Runtime #1103

sehan-dissanayake · 2026-02-12T06:55:33Z

sehan-dissanayake
Feb 12, 2026

Summary

Add Python runtime support to the gateway, enabling policies to be written in Python alongside existing Go policies
Run Python policy executor as a third managed process within the unified gateway-runtime container
Use Unix Domain Socket (UDS) for Go Policy Engine ↔ Python Policy Executor communication (with TCP fallback option)
Enable ML/AI-powered policies by leveraging Python's rich ecosystem of libraries (scikit-learn, transformers, numpy, etc.)
Users can mix Go and Python policies in the same policy chain for a single API

Motivation

The API Platform gateway currently supports policies written exclusively in Go. While Go provides excellent performance and is well suited for high-throughput request processing, some policy use cases involving machine learning and AI are more naturally expressed in Python because of its extensive ML and AI ecosystem.

Key drivers:

AI Gateway Use Cases: As the platform evolves to support AI gateway capabilities, policies need access to ML/AI libraries for tasks. For example:
- Prompt compression and optimization (reducing token costs for LLM calls)
- Content moderation and safety filtering using ML models
- Semantic routing based on embeddings
- Response quality evaluation and scoring
Ecosystem Access: Python's rich ecosystem (transformers, langchain, numpy, scikit-learn, etc.) enables sophisticated policy logic that would be impractical to reimplement in Go
Developer Flexibility: Allow policy authors to choose the right tool for the job—Go for performance-critical policies, Python for ML/AI-driven logic
Faster Development and Iteration: Leverage existing Python-based ML pipelines and models without requiring Go rewrites

Example Use Case: Prompt Compression Policy

An AI gateway handling LLM requests can benefit from a prompt compression policy that:

Analyzes incoming prompts using NLP libraries
Applies compression techniques to reduce token count
Maintains semantic meaning while lowering API costs
Implemented in Python using libraries such as llmlingua or compression-prompt, which have no Go equivalents

Proposal

Extend the unified gateway-runtime container (which currently manages Router and Go Policy Engine as described in the Unified Gateway Container proposal) to include a Python Policy Executor as a third managed process. The three processes will run under tini as PID 1, with the existing entrypoint script extended to manage Python policy lifecycle in addition to Router and Go Policy Engine.

Architecture at a glance:

┌─────────────────────────────────────────────────────────┐
│ gateway-runtime container                               │
│                                                         │
│  tini (PID 1)                                           │
│    └─ docker-entrypoint.sh                              │
│         ├─ Router (Envoy)          [rtr]                │
│         ├─ Go Policy Engine        [pol]                │
│         └─ Python Policy Executor  [py-pol]  ← NEW      │
│                                                         │
│  /app/policy-engine.sock      ← Router ↔ Go PE          │
│  /app/python-policy.sock      ← Go PE ↔ Python PE       │
└─────────────────────────────────────────────────────────┘

Request flow:

Client sends API request to Router (Envoy) on port 8080/8443
Router invokes Go Policy Engine via ext_proc filter (UDS)
Go Policy Engine executes policy chain:
- Go policies execute directly within the Go runtime
- When a Python policy is encountered, Go Policy Engine delegates to Python Policy Executor via gRPC (UDS)
- Python Policy Executor loads and executes the Python policy logic
Policy results flow back: Python PE → Go PE → Router → Backend/Client

Architecture Diagrams

Process Tree

graph TD
    A[tini PID 1] --> B[docker-entrypoint.sh]
    B --> C[Router / Envoy]
    B --> D[Go Policy Engine]
    B --> E[Python Policy Executor]
    
    C -->|ext_proc gRPC<br/>UDS socket| D
    D -->|policy execution gRPC<br/>UDS socket| E
    
    style A fill:#666,stroke:#fff,stroke-width:2px,color:#fff
    style B fill:#666,stroke:#fff,stroke-width:2px,color:#fff
    style C fill:#6cf,stroke:#333,stroke-width:3px,color:#333
    style D fill:#9f9,stroke:#333,stroke-width:3px,color:#333
    style E fill:#ff9,stroke:#333,stroke-width:3px,color:#333

Request Flow Sequence

sequenceDiagram
    participant Client
    participant Router as Router<br/>(Envoy)
    participant GoPE as Go Policy Engine
    participant PyPE as Python Policy<br/>Executor
    participant Backend

    Client->>Router: HTTP Request
    Router->>GoPE: ext_proc: Request Headers
    
    Note over GoPE: Execute policy chain
    
    GoPE->>GoPE: Execute Go Policy 1<br/>(e.g., JWT Auth)
    
    GoPE->>PyPE: Execute Python Policy<br/>(e.g., Prompt Compression)
    activate PyPE
    PyPE->>PyPE: Load Python policy module
    PyPE->>PyPE: Execute policy logic<br/>(ML inference, transformations)
    PyPE-->>GoPE: Policy result + metadata
    deactivate PyPE
    
    GoPE->>GoPE: Execute Go Policy 2<br/>(e.g., Rate Limiting)
    
    GoPE-->>Router: Modified headers/body
    Router->>Backend: Forwarded request
    Backend-->>Router: Response
    Router->>GoPE: ext_proc: Response
    GoPE-->>Router: Response modifications
    Router-->>Client: HTTP Response

Changes Required

Core Components

gateway-runtime Container
- Add Python 3.10+ runtime to Dockerfile (with multi python version support if needed)
- Extend entrypoint script to manage Python Policy Executor as third process
- Create isolated venvs for each Python policy during build (dependency isolation)
Python Policy Executor (New Process)
- gRPC server managing Python policy subprocesses via ProcessPoolExecutor
- Communicates with Go Policy Engine over UDS (Unix Domain Socket)
- Executes policies by calling venv-specific Python interpreters directly
Go Policy Engine Extensions
- Detect Python policies in chain and delegate to Python PE via gRPC
- Handle policy results and integrate into chain execution
Gateway Builder
- Discover Python policies from manifest (similar to Go module discovery)
- Create virtual environments per policy with isolated dependencies
- Generate Python policy registry (discovery mechanism to be determined)

Implementation Details

Dependency Isolation via Virtual Environments

One venv per policy: Each Python policy gets its own virtual environment with isolated dependencies, preventing dependency conflicts between policies.

Build process (multi-stage Docker):

Builder stages: Default single builder stage for base Python version (e.g., FROM python:3.10-slim AS builder). If needed, add additional builder stages for other Python versions (e.g., FROM python:3.11-slim AS python311-builder)
Per-policy venv creation: For each policy, read python_version.txt and requirements.txt
- Create venv: python3.X -m venv /policies/<policy_name>/venv
- Install dependencies: /policies/<policy_name>/venv/bin/pip install -r requirements.txt
Runtime stage: Copy all policy directories with their pre-built venvs from builder stages using COPY --from=<builder> directives

Policy execution calls the venv's Python binary directly:

/policies/my-policy/venv/bin/python /policies/my-policy/policy.py

Multi-version support (optional): If policies specify different Python versions (3.10, 3.11, etc.), the builder creates appropriate venvs from the corresponding builder stage.

Process Model and Subprocess Management

Python Policy Executor Architecture:

Python process with embedded gRPC server (communicates with Go Policy Engine via UDS)
Uses ProcessPoolExecutor to manage Python subprocess execution
Each policy execution spawns subprocess calling the policy's venv-specific Python interpreter
Managed by Python Policy Executor, not tini: The individual Python policy subprocesses are spawned and managed by the Python Policy Executor's ProcessPoolExecutor, giving fine-grained control over policy execution lifecycle

Execution flow:

Go Policy Engine identifies Python policy in chain → sends gRPC request to Python Policy Executor
Python Policy Executor receives the request
Python Policy Executor spawns subprocess: subprocess.run([venv_python, policy_py], input=json_input, capture_output=True)
Policy executes in isolated subprocess with its venv dependencies
Python Policy Executor returns result to Go Policy Engine via gRPC response

Benefits of subprocess model:

Crash isolation: Policy crash doesn't affect Python Policy Executor or other policies
Resource isolation: Per-execution memory limits, no shared state between executions
Independent lifecycle: Policies start/stop independently, no daemon management required
Security: Each execution runs in isolated process space

Communication Protocol

Go Policy Engine ↔ Python Policy Executor communication uses gRPC over UDS:

Default: UDS at /app/python-policy.sock (lowest latency, same pattern as Router ↔ Go Policy Engine)
Fallback: TCP on port 9004 (for development/debugging)
Protocol: gRPC with protobuf for type safety and efficiency

Drawbacks and Trade-offs

Resource usage: Adding Python runtime increases container memory footprint
Startup latency: Python interpreter initialization and policy module loading add few seconds to container startup time
Performance considerations: Python policies will have higher per-request latency than Go policies due to:
- Language overhead (Python vs compiled Go)
- ML model inference time (varies by model complexity)
Dependency management: Python policies may have conflicting dependencies; requires careful environment isolation

Mitigation strategies:

Python support is opt-in: APIs that don't use Python policies don't pay the performance cost
UDS communication minimizes IPC overhead
Use Python policies strategically for AI/ML tasks where Go alternatives don't exist
Document performance characteristics and best practices clearly

Alternatives Considered

Alternative 1: Separate Python Policy Engine Container

Run Python Policy Executor as a sidecar container instead of embedding in gateway-runtime.

Pros: Independent scaling, clearer resource isolation
Cons: Inter-container network latency, more complex deployment (3 containers), harder debugging
Why rejected: UDS within the same container provides lower latency and keeps the proven unified container model.

Alternative 2: Embed Python Runtime in Go via Cgo

Embed Python interpreter directly using cgo bindings (e.g., go-python).

Pros: Single process, no IPC overhead
Cons: Cgo build complexity, Python GIL contention with goroutines, crash isolation difficult, Go codebase becomes complex
Why rejected: Process isolation benefits outweigh IPC cost with UDS.

Open Questions

Python policy repository and distribution: How should Python policies be stored and distributed?

Context: Go policies are stored as Go modules in GitHub because that's the native way Go modules are distributed and consumed (via go get). Should python follow its native distribution approach?

External Repository Options:
- Option A: Mono-repo with Go policies (gateway-controllers repo with python-policies/ subdirectory)
- Option B: Separate Python policy repository (e.g., gateway-python-policies)
- Option C: Distribute Python policies as pip packages (PyPI or private package index)
- Option D: Include in gateway repo (not external)
Discovery mechanism: Use Python build manifest (python-build.yaml) similar to Go's build.yaml, referencing either GitHub repos or pip packages.
Error handling: How should Python policy crashes be handled? Fail the request or skip the policy?
- Default: Fail the request (fail-safe); make configurable per-policy
Log prefix name: Should we use [py-pol] or any alternative?

sehan-dissanayake · 2026-02-12T07:05:07Z

sehan-dissanayake
Feb 12, 2026
Author

Performance Issue & Refinement

After discussions with @renuka-fernando and @HeshanSudarshana, we identified some issues with the proposed method:

Problem: Spawning a new Python process for each request is expensive and would cause significant latency.

Potential Approaches:

Always long-running policy processes as persistent daemons
Single shared Python process handling all policies

Dependency conflict concern:
Multiple policies with conflicting dependencies in a shared process environment could cause runtime issues. Virtual environments help at the filesystem level, but in-memory dependency conflicts require careful handling.

Need to find the optimal balance between performance, resource efficiency and maintainability.

0 replies

DinithHerath · 2026-02-12T12:19:53Z

DinithHerath
Feb 12, 2026
Collaborator

@sehan-dissanayake The proposal is solid overall.

One concern I had is around the disk footprint and runtime cost of maintaining a fully isolated virtual environment per Python policy, especially for ML-related dependencies (e.g., numpy, scikit-learn, torch), which can become quite large. As an example venv with above dependencies may scale upto ~800mb which in theory result in lot of disk space being consumed as generally users use lot of polices)

As a possible refinement, we could consider a hybrid environment approach:

A shared base Python environment with commonly used dependencies that most policies can target by default.
Opt-in dedicated environments only for policies that require conflicting or uncommon dependencies.

In addition, aligning with the performance discussion already noted, using long-running Python worker processes instead of spawning a new process per request could help reduce latency.

This might provide a better balance between isolation, performance, and resource usage.

0 replies

sehan-dissanayake · 2026-02-24T04:36:32Z

sehan-dissanayake
Feb 24, 2026
Author

Implementation Update: Python Policy Support

The Python policy support basic functionality has been implemented. Here's how it actually landed. A few things changed from the original proposal.

What Changed: Venv-per-Policy → Merged `pip install --target`

The original proposal called for one virtual environment per Python policy for dependency isolation. We dropped this in favor of a simpler merged requirements.txt + pip install --target approach. Here's why:

Image bloat: Each venv duplicates the Python standard library and pip tooling. With 5 policies, that's ~200–300 MB of redundant copies inside the container. A single shared install target keeps the image lean.
Build complexity: Creating N venvs in a multi-stage Docker build means N pip install invocations, each potentially compiling C extensions (grpcio, numpy, etc.) — slow and fragile. One merged install is straightforward.
Startup overhead: The Python Executor loads all policies into a single process. Separate venvs would require sys.path juggling per import, risking subtle path resolution bugs.
Practical conflict risk is low: Gateway policies are curated at build time via build.yaml. The operator controls which policies ship together, so dependency conflicts are caught during the build, not at runtime.

If true isolation becomes necessary in the future, we can revisit with container-level separation or a per-policy subprocess model. For now, the merged approach is correct for the use case.

How It Actually Works

The Go Side: Minimal Bridge, Not a Rewrite

The Go Policy Engine was not significantly modified. The key insight: a Python policy looks exactly like a Go policy to the rest of the system. This is achieved through a bridge pattern in a single package — pythonbridge/.

The bridge has four files:

File	Role
`factory.go`	`BridgeFactory` — conforms to `policy.PolicyFactory`, same as Go policy factories. Returns a `PythonBridge` instance.
`bridge.go`	`PythonBridge` — implements the `policy.Policy` interface (`OnRequest`, `OnResponse`, `Mode`). Translates Go SDK types → protobuf, calls the stream manager, translates protobuf → Go SDK types back.
`client.go`	`StreamManager` — singleton gRPC client. Maintains one persistent bidirectional stream to the Python Executor over UDS. Multiplexes concurrent requests via `request_id` correlation.
`translator.go`	Proto ↔ Go SDK type conversion.

The policy engine's core chain execution is untouched. It iterates the policy chain, calls OnRequest()/OnResponse() on each policy — whether that's a native Go policy or a PythonBridge is invisible to it. The bridge just serializes the call to protobuf, sends it over the gRPC stream, waits for the correlated response, and translates back.

xDS updates are consumed entirely by Go. When the controller pushes a route update, Go's buildPolicyChain() instantiates a PythonBridge per Python policy (holding params, metadata, processing mode as Go structs) — exactly the same lifecycle as Go-native policy instances. The Python Executor is never notified about xDS events. Python doesn't know what routes exist or what params are configured until an actual execution request arrives.

The Python Side: Async gRPC Server + Thread Pool

The Python Executor (python-executor/) is a standalone async gRPC server:

Python Executor Process
├── asyncio event loop (main thread)
│   └── gRPC async server on unix:///var/run/api-platform/python-executor.sock
│       └── ExecuteStream handler (bidi streaming)
│           └── For each request: asyncio.run_in_executor(thread_pool, execute_policy)
├── ThreadPoolExecutor (default 4 workers, configurable via PYTHON_POLICY_WORKERS)
│   ├── Worker thread 1 → policy.on_request() / policy.on_response()
│   ├── Worker thread 2 → ...
│   ├── Worker thread 3 → ...
│   └── Worker thread 4 → ...
├── PolicyLoader — imports policies from generated registry at startup
├── PolicyCache — lazy, content-addressed: key = (name, version, sha256(params))
└── Metrics HTTP server (port 9119)

Why a thread pool, not subprocesses? The original proposal suggested ProcessPoolExecutor with subprocess spawning per execution. That was dropped because:

Subprocess spawn overhead per request is too high for latency-sensitive API traffic
Policy instances need to stay alive across requests (model loading, connection pools, etc.)
Thread pool gives us concurrency with shared policy instances — a loaded ML model is initialized once and reused

The Python Policy SDK

The SDK mirrors the Go policy/v1alpha interface:

class Policy(ABC):
    def __init__(self, metadata: PolicyMetadata, params: Dict[str, Any]): ...

    @abstractmethod
    def on_request(self, ctx: RequestContext, params: Dict) -> RequestAction: ...

    @abstractmethod
    def on_response(self, ctx: ResponseContext, params: Dict) -> ResponseAction: ...

Action types (UpstreamRequestModifications, ImmediateResponse, UpstreamResponseModifications) are dataclasses that match Go SDK equivalents. A policy author writes a policy.py with a get_policy(metadata, params) factory function — same pattern as Go.

Mixed Go + Python Policy Chain: The Bridge Pattern

The central design point: Python policies are wrapped in Go. The chain executor never knows it's talking to Python — every policy in the chain implements the same Go policy.Policy interface. State (params, metadata, shared context) lives entirely on the Go side. Python is a stateless execution backend.

Here's a example route /foo with three policies — jwt-auth (Go), prompt-compress (Python), rate-limit (Go):

graph TB
    subgraph GoPE["Go Policy Engine Process — all state lives here"]
        direction TB
        
        CE["Chain Executor<br/>iterates []policy.Policy"]
        
        subgraph chain["Route /foo — Policy Chain"]
            direction LR
            
            subgraph p1["Policy 1"]
                P1I["<b>jwt-auth</b><br/><i>Go native</i>"]
                P1T["implements policy.Policy"]
                P1S["State: params, metadata,<br/>shared context"]
            end
            subgraph p2["Policy 2"]
                P2I["<b>PythonBridge</b><br/><i>wraps prompt-compress</i>"]
                P2T["implements policy.Policy"]
                P2S["State: params, metadata,<br/>shared context, mode"]
            end
            subgraph p3["Policy 3"]
                P3I["<b>rate-limit</b><br/><i>Go native</i>"]
                P3T["implements policy.Policy"]
                P3S["State: params, metadata,<br/>shared context"]
            end
        end
        SM["StreamManager (singleton)<br/>persistent bidi gRPC stream<br/>multiplexed via request_id"]
    end
    subgraph PyExec["Python Executor Process — stateless execution"]
        direction TB
        GRPC["async gRPC server<br/>UDS: python-executor.sock"]
        TP["ThreadPoolExecutor<br/>(4 workers)"]
        PC["PolicyCache<br/>lazy, content-addressed"]
        
        subgraph workers["Worker Threads"]
            W1["Thread 1<br/>prompt-compress.on_request()"]
            W2["Thread 2<br/>(available)"]
            W3["Thread 3<br/>(available)"]
            W4["Thread 4<br/>(available)"]
        end
    end
    CE -->|"1. OnRequest(ctx, params)"| P1I
    CE -->|"2. OnRequest(ctx, params)"| P2I
    CE -->|"3. OnRequest(ctx, params)"| P3I
    P2I -->|"serialize to protobuf<br/>+ request_id"| SM
    SM -->|"gRPC over UDS"| GRPC
    GRPC --> TP
    TP --> W1
    W1 -.->|"ExecutionResponse<br/>(correlated by request_id)"| SM
    SM -.->|"translate proto → Go SDK<br/>merge metadata back"| P2I
    style GoPE fill:#e8f5e9,stroke:#388e3c,stroke-width:2px,color:#1b5e20
    style PyExec fill:#fff8e1,stroke:#f9a825,stroke-width:2px,color:#e65100
    style p1 fill:#c8e6c9,stroke:#66bb6a,color:#1b5e20
    style p2 fill:#ffe0b2,stroke:#ffa726,stroke-width:2px,color:#bf360c
    style p3 fill:#c8e6c9,stroke:#66bb6a,color:#1b5e20
    style chain fill:#f1f8e9,stroke:#aed581,color:#33691e
    style SM fill:#e3f2fd,stroke:#42a5f5,color:#0d47a1
    style workers fill:#fff9c4,stroke:#ffee58,color:#f57f17

Key things to notice:

All three policies implement policy.Policy. The chain executor calls OnRequest(ctx, params) on each one identically. It doesn't know or care that Policy 2 is Python.
PythonBridge holds the state — params, metadata, shared context, processing mode. These are Go structs. Python never stores per-route state; it receives context as protobuf, executes, and returns a result.
StreamManager is a singleton shared by all PythonBridge instances across all routes. It maintains one persistent gRPC bidi stream and multiplexes via request_id — so 100 concurrent requests share one stream, no connection-per-call overhead.
Go owns the full policy lifecycle. xDS updates create/destroy PythonBridge instances in Go. Python is never notified — it only sees execution calls, never configuration events.

Build-Time Integration

Python policies are declared in build.yaml alongside Go policies:

policies:
  # Go policies — remote module reference
  - name: jwt-auth
    gomodule: github.com/wso2/gateway-policies/jwt-auth/v0.1.0

  # Python policies — remote module reference (analogous to gomodule)
  - name: prompt-compress
    pythonmodule: github.com/wso2/gateway-python-policies/[email protected]

  # filePath — local/dev policies only (both Go and Python)
  - name: my-local-policy
    filePath: ./dev-policies/my-local-policy

The gateway-builder detects runtime: python in the policy's policy-definition.yaml and:

Copies the policy source into the build output under python-executor/policies/
Generates a python_policy_registry.py mapping "name:version" → "policies.module.policy"
Merges all requirements.txt files (base executor deps + per-policy deps) into one
On the Go side, generates a BridgeFactory registration instead of a Go plugin registration

In the Dockerfile runtime stage, a single pip3 install --target /app/python-libs installs all merged dependencies. The Python Executor is started conditionally — only if main.py exists in the image (it's always copied, but if there are zero Python policies, the entrypoint skips launching it).

Container Process Model

Same as proposed — three processes under tini, managed by the entrypoint script:

tini (PID 1)
└── docker-entrypoint.sh
    ├── python3 main.py          [pye]  ← started first, waits for UDS socket
    ├── policy-engine             [pol]  ← started second, connects to Python Executor
    └── envoy                     [rtr]  ← started last, connects to Policy Engine

Startup is sequential (Python Executor → Policy Engine → Envoy) with socket readiness checks. If any process dies, the entrypoint tears down the rest and exits.

0 replies

sehan-dissanayake · 2026-02-25T04:05:41Z

sehan-dissanayake
Feb 25, 2026
Author

Proto File

syntax = "proto3";

package wso2.gateway.python.v1;

option go_package = "github.com/wso2/api-platform/gateway/gateway-runtime/policy-engine/internal/pythonbridge/proto";

import "google/protobuf/struct.proto";

// PythonExecutorService defines the gRPC contract between Go PE and the Python process.
// The Python process is the gRPC SERVER, Go PE is the CLIENT.
service PythonExecutorService {
  // Bidirectional stream for executing policies.
  // Go sends ExecutionRequest, Python responds with ExecutionResponse.
  // Each request has a unique request_id that the response must echo back.
  rpc ExecuteStream (stream ExecutionRequest) returns (stream ExecutionResponse);

  // Health check for readiness.
  rpc HealthCheck (HealthCheckRequest) returns (HealthCheckResponse);
}

// ---------------------- Request / Response ----------------------

message ExecutionRequest {
  // Unique ID per call so responses can be correlated on the single stream.
  string request_id = 1;

  // Policy to execute (name:version format used for cache lookup, e.g., "my-policy:v1")
  string policy_name = 2;
  string policy_version = 3;

  // Phase: "on_request" or "on_response"
  string phase = 4;

  // Merged parameters (system + user) for this policy instance.
  // The Go side resolves ${config} references in systemParameters and merges
  // them with user parameters before sending. Python never sees raw ${config} strings.
  google.protobuf.Struct params = 5;

  // The request or response context data
  oneof context {
    RequestContext request_context = 6;
    ResponseContext response_context = 7;
  }

  // Shared context (metadata, API info, auth context)
  SharedContext shared_context = 8;

  // Policy metadata (route info, API info) for factory creation
  PolicyMetadata policy_metadata = 9;
}

message ExecutionResponse {
  // Must match request_id from the corresponding ExecutionRequest
  string request_id = 1;

  oneof result {
    RequestActionResult request_result = 2;
    ResponseActionResult response_result = 3;
    ExecutionError error = 4;
  }

  // Updated shared metadata (Python may have mutated it).
  // Go side merges this back into the SharedContext.
  google.protobuf.Struct updated_metadata = 5;
}

// ---------------------- Context Messages ----------------------

message SharedContext {
  string project_id = 1;
  string request_id = 2;
  google.protobuf.Struct metadata = 3;     // Inter-policy communication map
  string api_id = 4;
  string api_name = 5;
  string api_version = 6;
  string api_kind = 7;
  string api_context = 8;
  string operation_path = 9;
  map<string, string> auth_context = 10;
}

message RequestContext {
  map<string, string> headers = 1;
  bytes body = 2;
  bool body_present = 3;
  bool end_of_stream = 4;
  string path = 5;
  string method = 6;
  string authority = 7;
  string scheme = 8;
}

message ResponseContext {
  // Original request data (immutable)
  map<string, string> request_headers = 1;
  bytes request_body = 2;
  string request_path = 3;
  string request_method = 4;

  // Response data
  map<string, string> response_headers = 5;
  bytes response_body = 6;
  bool response_body_present = 7;
  int32 response_status = 8;
}

message PolicyMetadata {
  string route_name = 1;
  string api_id = 2;
  string api_name = 3;
  string api_version = 4;
  string attached_to = 5;   // "api" or "route"
}

// ---------------------- Action Results ----------------------

message RequestActionResult {
  oneof action {
    UpstreamRequestModifications continue_request = 1;
    ImmediateResponseAction immediate_response = 2;
  }
}

message ResponseActionResult {
  oneof action {
    UpstreamResponseModifications continue_response = 1;
  }
}

message UpstreamRequestModifications {
  map<string, string> set_headers = 1;
  repeated string remove_headers = 2;
  map<string, StringList> append_headers = 3;
  bytes body = 4;
  bool body_present = 5;              // false means no body change, true means use body field (even if empty)
  string path = 6;
  bool path_present = 7;
  string method = 8;
  bool method_present = 9;
  google.protobuf.Struct analytics_metadata = 10;
}

message UpstreamResponseModifications {
  map<string, string> set_headers = 1;
  repeated string remove_headers = 2;
  map<string, StringList> append_headers = 3;
  bytes body = 4;
  bool body_present = 5;
  int32 status_code = 6;
  bool status_code_present = 7;
  google.protobuf.Struct analytics_metadata = 8;
}

message ImmediateResponseAction {
  int32 status_code = 1;
  map<string, string> headers = 2;
  bytes body = 3;
  google.protobuf.Struct analytics_metadata = 4;
}

message ExecutionError {
  string message = 1;
  string policy_name = 2;
  string policy_version = 3;
  string error_type = 4;    // "init_error", "execution_error", "timeout"
}

// ---------------------- Health Check ----------------------

message HealthCheckRequest {}

message HealthCheckResponse {
  bool ready = 1;
  int32 loaded_policies = 2;
}

// ---------------------- Utility ----------------------

message StringList {
  repeated string values = 1;
}

0 replies

renuka-fernando · 2026-03-03T05:31:09Z

renuka-fernando
Mar 3, 2026
Collaborator

Policy Instance Lifecycle Gap in the Python Bridge

The current Python bridge design describes Python as a "stateless execution backend" — but Go policies are not stateless. The PolicyFactory pattern (discussion #395) gives each Go policy full control over its instancing strategy, and several production policies rely on this to cache expensive state across requests. The bridge needs to support these same patterns for Python policies, or clearly scope which policy types Python can implement.

How Go Policy Lifecycle Works Today

Every Go policy exports a GetPolicy factory (PolicyFactory):

type PolicyFactory func(metadata PolicyMetadata, params map[string]interface{}) (Policy, error)

The factory receives PolicyMetadata (including RouteName, APIName, APIVersion, AttachedTo) and merged params. The policy engine calls this factory during chain construction (triggered by xDS route updates). The factory controls its own instancing — it decides whether to return a singleton, a cached instance, or a fresh instance. Three patterns exist in production:

Pattern	Cache Key	State Preserved	Examples
Singleton	None (global `var ins`)	None — params parsed each request	`add-headers`, `cors`, `set-headers`
Cached by config	`hash(params)` or `hash(initParams)`	Parsed config, JWKS keys, HTTP clients, embedding providers	`jwt-auth`, `semantic-cache`
Cached by route	`metadata.RouteName`	Per-route counters, per-route limiter state	`advanced-ratelimit`

Concrete examples from gateway-controllers/policies/:

Rate limiting — caches per-route limiter instances in a globalLimiterCache with reference counting. When GetPolicy is called for a route, it looks up baseCacheKey (derived from routeName + algorithm + params hash). If a limiter exists, it reuses it (preserving counter state across xDS reloads). If not, it creates a new one. When a route is removed and its GetPolicy is no longer called, ref count drops and the limiter is eventually cleaned up.

Semantic cache — creates a new instance per call to GetPolicy, initializing an embedding provider and vector store provider (expensive operations: connection setup, index creation). The instance holds provider references for the lifetime of the route.

JWT auth — returns a singleton but maintains an internal cacheStore map[string]*CachedJWKS protected by sync.RWMutex. JWKS public keys are fetched once per issuer and cached with TTL across all routes.

The Gap in the Python Bridge

The bridge as described creates a PythonBridge on the Go side per route (correct — same lifecycle as any Go policy). But on the Python side:

PolicyCache keys on (name, version, sha256(params)) — this works for singleton and cached-by-config patterns, but breaks for per-route state. If two routes configure the same Python policy with identical params, they resolve to the same Python instance and share state incorrectly. A rate limiter on route /api/v1/pets would share counters with /api/v2/users.
No lifecycle signals from Go to Python. In Go, when an xDS update removes or reconfigures a route:
- Old policy instances are dereferenced and GC'd
- New instances are created via GetPolicy with fresh or cached state
- advanced-ratelimit uses ref-counting in globalLimiterCache to clean up stale limiters
Additionally, #855 / discussion #734 adds an explicit Close() error method to the Go Policy interface. The Kernel will call Close() asynchronously (with timeout + worker pool) on all policies in a chain when that route is removed or replaced. This gives Go policies a deterministic cleanup hook for connections, goroutines, and caches — e.g., SemanticCachePolicy.Close() releases vector store and embedding provider connections, rate limiting closes Redis clients.

Python has no equivalent of either mechanism. There's no DestroyPolicy RPC. The PolicyCache has no eviction mechanism tied to route lifecycle. Stale instances (holding model weights, connection pools, counters) accumulate indefinitely. When Close() lands on the Go side, PythonBridge.Close() will need to propagate the signal to the Python executor to trigger cleanup on the Python instance — but the current proto has no RPC for this.
No factory-controlled instancing. In Go, the factory decides the caching strategy — the engine just calls GetPolicy and trusts the result. In Python, the executor's PolicyCache imposes a fixed caching strategy from outside the policy. The policy author has no say in whether instances are shared or isolated.

Proposed Solution: Lifecycle RPCs with Factory-Controlled Instancing

The fix is to add lifecycle RPCs to the proto that mirror the Go policy lifecycle exactly. In Go, the two lifecycle entry points are GetPolicy() (create/cache instance) and Close() (cleanup on removal, from #855 / discussion #734). The Python bridge needs the same two signals:

InitPolicy — the GetPolicy() equivalent. Called when Go builds a policy chain (xDS route update). Python executor receives metadata + params, calls get_policy() factory, and the factory controls its own instancing (singleton, cached-by-config, or per-route — same as Go).
DestroyPolicy — the Close() equivalent. Called when Go tears down a policy chain (route removed/replaced). Python executor calls close() on the instance and removes it from cache.

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Proposal] Python Policy Support: Enable ML/AI-Powered Policies via Python Runtime #1103

Uh oh!

{{title}}

Uh oh!

Replies: 5 comments

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{editor}}'s edit

{{editor}}'s edit

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{editor}}'s edit

{{editor}}'s edit

Uh oh!

Select a reply

Uh oh!

[Proposal] Python Policy Support: Enable ML/AI-Powered Policies via Python Runtime #1103

Uh oh!

sehan-dissanayake Feb 12, 2026

Summary

Motivation

Proposal

Architecture Diagrams

Process Tree

Request Flow Sequence

Changes Required

Core Components

Implementation Details

Dependency Isolation via Virtual Environments

Process Model and Subprocess Management

Communication Protocol

Drawbacks and Trade-offs

Alternatives Considered

Alternative 1: Separate Python Policy Engine Container

Alternative 2: Embed Python Runtime in Go via Cgo

Open Questions

Replies: 5 comments

Uh oh!

sehan-dissanayake Feb 12, 2026 Author

Performance Issue & Refinement

Uh oh!

Uh oh!

DinithHerath Feb 12, 2026 Collaborator

Uh oh!

sehan-dissanayake Feb 24, 2026 Author

Implementation Update: Python Policy Support

What Changed: Venv-per-Policy → Merged pip install --target

How It Actually Works

The Go Side: Minimal Bridge, Not a Rewrite

The Python Side: Async gRPC Server + Thread Pool

The Python Policy SDK

Mixed Go + Python Policy Chain: The Bridge Pattern

Build-Time Integration

Container Process Model

Uh oh!

sehan-dissanayake Feb 25, 2026 Author

Proto File

Uh oh!

Uh oh!

renuka-fernando Mar 3, 2026 Collaborator

Policy Instance Lifecycle Gap in the Python Bridge

How Go Policy Lifecycle Works Today

The Gap in the Python Bridge

Proposed Solution: Lifecycle RPCs with Factory-Controlled Instancing

sehan-dissanayake
Feb 12, 2026

sehan-dissanayake
Feb 12, 2026
Author

DinithHerath
Feb 12, 2026
Collaborator

sehan-dissanayake
Feb 24, 2026
Author

What Changed: Venv-per-Policy → Merged `pip install --target`

sehan-dissanayake
Feb 25, 2026
Author

renuka-fernando
Mar 3, 2026
Collaborator