Engineering

Deploying Decentralized Clean Rooms on Kubernetes

18 min read
Placino Infrastructure Team

Deploying a self-hosted data clean room platform requires orchestrating dozens of containers, stateful components, and network isolation policies. This guide covers the production deployment of Placino on Kubernetes, from initial architecture decisions through hardening for air-gapped environments.

Prerequisites and Architecture Overview

Placino's architecture comprises 35 Docker containers, including 10 Go microservices, 8 infrastructure components, 8 observability services, a React frontend, and an API gateway. Three isolated Docker networks enforce data separation at the container level: the frontend network hosts the React application and the Kong gateway, the backend network runs the core microservices, and the data network isolates PostgreSQL, ClickHouse, Kafka, Valkey, and MinIO.

Core Services and Ports

The microservices operate on dedicated internal ports:

// Authentication and Gateway
auth-service:8060           # Token generation, SAML/OIDC validation
kong-api-gateway:8000       # Public API endpoint
kong-admin:8001             # Kong administration API

// Core Data Operations
ingestion-service:8010      # Data ingestion, deduplication, normalization
matching-service:8020       # Entity resolution, fuzzy matching
query-service:8030          # Federated SQL execution
governance-service:8040     # Consent, retention, lineage management
audit-service:8050          # Event logging, compliance audit trail
catalog-service:8070        # Data asset discovery, metadata

// AI and ML Pipeline
ai-proxy-service:8090       # LLM integration, model serving
audience-service:8089       # Behavioral segmentation, scoring
ml-service:8088             # Model training, inference

// Infrastructure
opa-sidecar:8181            # Open Policy Agent enforcement
prometheus:9090             # Metrics collection
grafana:3000                # Metrics visualization

Kubernetes Prerequisites

  • Kubernetes 1.24+ with RBAC enabled
  • Helm 3.12+ for chart management
  • Container runtime: containerd or CRI-O (Docker Engine requires the cri-dockerd shim on Kubernetes 1.24+)
  • Minimum cluster resources: 8 CPU cores, 32GB memory (development); 24 cores, 128GB (production)
  • Storage class provisioner (EBS, GCE PD, local-path, or NFS)
  • A CNI plugin that enforces NetworkPolicy (Calico, Cilium, or similar)

Namespace and Network Policy Design

Kubernetes namespaces provide administrative boundaries, but true isolation requires network policies. Placino's three-network architecture maps onto four namespaces (frontend, backend, data, plus observability) with strict ingress/egress rules.

Namespace Segmentation

apiVersion: v1
kind: Namespace
metadata:
  name: placino-frontend
  labels:
    app.kubernetes.io/name: placino
    app.kubernetes.io/component: frontend
---
apiVersion: v1
kind: Namespace
metadata:
  name: placino-backend
  labels:
    app.kubernetes.io/name: placino
    app.kubernetes.io/component: backend
---
apiVersion: v1
kind: Namespace
metadata:
  name: placino-data
  labels:
    app.kubernetes.io/name: placino
    app.kubernetes.io/component: data
---
apiVersion: v1
kind: Namespace
metadata:
  name: placino-observability
  labels:
    app.kubernetes.io/name: placino
    app.kubernetes.io/component: observability

Network Policies

Network policies define the allowed traffic flows. The frontend namespace accepts external ingress but may only egress to Kong within its own namespace, to the backend service ports, to Kong's Postgres database, and to cluster DNS. Backend services can reach data-layer components plus Vault, while pods in the data namespace may talk only to each other on the data-service ports and to DNS. The selectors below match namespaces by the built-in kubernetes.io/metadata.name label, which the API server sets automatically.

# Frontend namespace: external ingress stays open; egress is limited to
# Kong within the namespace, the backend services, Kong's database, and DNS
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: frontend-egress
  namespace: placino-frontend
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - podSelector: {}   # intra-namespace: React to Kong proxy/admin
    ports:
    - protocol: TCP
      port: 8000
    - protocol: TCP
      port: 8001
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: placino-backend
    ports:
    - protocol: TCP
      port: 8010
      endPort: 8090   # backend microservice ports (see the port table above)
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: placino-data
    ports:
    - protocol: TCP
      port: 5432   # Kong's configuration database
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53
---
# Backend namespace: permit ingress from frontend, egress to data layer and DNS
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-data-isolation
  namespace: placino-backend
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: placino-frontend
    ports:
    - protocol: TCP
      port: 8010
      endPort: 8090   # Kong routes to the individual service ports
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: placino-data
    ports:
    - protocol: TCP
      port: 5432   # PostgreSQL
    - protocol: TCP
      port: 8123   # ClickHouse
    - protocol: TCP
      port: 9092   # Kafka
    - protocol: TCP
      port: 6379   # Valkey
    - protocol: TCP
      port: 9000   # MinIO
    - protocol: TCP
      port: 8200   # Vault
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53
---
# Data namespace: deny all egress except DNS and inter-data services
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: data-network-isolation
  namespace: placino-data
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - podSelector: {}   # pods within placino-data only
    ports:
    - protocol: TCP
      port: 5432
    - protocol: TCP
      port: 8123
    - protocol: TCP
      port: 9092
    - protocol: TCP
      port: 6379
    - protocol: TCP
      port: 9000
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53

Helm Chart Structure

A single umbrella Helm chart simplifies upgrades and keeps cross-namespace dependencies in one release. The chart organizes resources by concern: a template directory per component group, shared secrets, and configurable values for each microservice.

placino-helm/
├── Chart.yaml
├── values.yaml
├── values-dev.yaml
├── values-prod.yaml
├── templates/
│   ├── _helpers.tpl
│   ├── namespace.yaml
│   ├── network-policies.yaml
│   ├── secrets/
│   │   ├── vault-auth.yaml
│   │   ├── tls-certs.yaml
│   │   └── database-credentials.yaml
│   ├── frontend/
│   │   ├── react-deployment.yaml
│   │   ├── react-service.yaml
│   │   └── react-configmap.yaml
│   ├── kong/
│   │   ├── kong-deployment.yaml
│   │   ├── kong-service.yaml
│   │   ├── kong-configmap.yaml
│   │   └── kong-ingress.yaml
│   ├── backend/
│   │   ├── auth-deployment.yaml
│   │   ├── ingestion-deployment.yaml
│   │   ├── matching-deployment.yaml
│   │   ├── query-deployment.yaml
│   │   ├── governance-deployment.yaml
│   │   ├── audit-deployment.yaml
│   │   ├── catalog-deployment.yaml
│   │   ├── ai-proxy-deployment.yaml
│   │   ├── audience-deployment.yaml
│   │   ├── ml-deployment.yaml
│   │   ├── opa-sidecar-configmap.yaml
│   │   └── backend-services.yaml
│   ├── data/
│   │   ├── postgres-deployment.yaml
│   │   ├── postgres-pvc.yaml
│   │   ├── clickhouse-deployment.yaml
│   │   ├── clickhouse-pvc.yaml
│   │   ├── kafka-statefulset.yaml
│   │   ├── valkey-deployment.yaml
│   │   ├── minio-statefulset.yaml
│   │   └── data-services.yaml
│   ├── observability/
│   │   ├── prometheus-deployment.yaml
│   │   ├── prometheus-configmap.yaml
│   │   ├── grafana-deployment.yaml
│   │   ├── grafana-configmap.yaml
│   │   └── observability-services.yaml
│   ├── vault/
│   │   ├── vault-deployment.yaml
│   │   ├── vault-configmap.yaml
│   │   └── vault-service.yaml
│   ├── rbac/
│   │   ├── service-accounts.yaml
│   │   ├── role-bindings.yaml
│   │   └── pod-security-policies.yaml
│   └── ingress.yaml
└── charts/
    └── (external chart dependencies if using subchart pattern)

values.yaml Structure

replicaCount: 3
environment: production

image:
  registry: gcr.io
  repository: placino-images
  tag: "1.4.0"
  pullPolicy: IfNotPresent

frontend:
  react:
    replicas: 3
    resources:
      requests:
        memory: "256Mi"
        cpu: "100m"
      limits:
        memory: "512Mi"
        cpu: "500m"
    env:
      API_ENDPOINT: "https://api.placino.local"
      LOG_LEVEL: "info"

kong:
  replicas: 3
  resources:
    requests:
      memory: "512Mi"
      cpu: "250m"
    limits:
      memory: "1Gi"
      cpu: "1000m"
  config:
    database: postgres          # matches KONG_DATABASE in the deployment
    dns_resolver: "10.96.0.10"  # cluster DNS service IP; Kong expects an IP, not a hostname
    upstream_keepalive: 60

backend:
  auth:
    replicas: 2
    resources:
      requests:
        memory: "512Mi"
        cpu: "250m"
      limits:
        memory: "1Gi"
        cpu: "1000m"
    env:
      AUTH_TIMEOUT: "30s"
      TOKEN_TTL: "3600"
      SAML_ENABLED: "true"

  ingestion:
    replicas: 3
    resources:
      requests:
        memory: "1Gi"
        cpu: "500m"
      limits:
        memory: "2Gi"
        cpu: "2000m"
    batchSize: 10000
    deduplicationWindow: "24h"

  matching:
    replicas: 2
    resources:
      requests:
        memory: "2Gi"
        cpu: "1000m"
      limits:
        memory: "4Gi"
        cpu: "2000m"
    algorithm: "fuzzy-leven"
    confidenceThreshold: 0.85

  query:
    replicas: 3
    resources:
      requests:
        memory: "1Gi"
        cpu: "500m"
      limits:
        memory: "3Gi"
        cpu: "2000m"
    queryTimeout: "300s"
    maxConcurrency: 50

  governance:
    replicas: 2
    resources:
      requests:
        memory: "512Mi"
        cpu: "250m"
      limits:
        memory: "1Gi"
        cpu: "1000m"

  audit:
    replicas: 2
    resources:
      requests:
        memory: "512Mi"
        cpu: "250m"
      limits:
        memory: "1Gi"
        cpu: "1000m"
    retention: "90d"

  catalog:
    replicas: 2
    resources:
      requests:
        memory: "512Mi"
        cpu: "250m"
      limits:
        memory: "1Gi"
        cpu: "1000m"

  aiProxy:
    replicas: 2
    resources:
      requests:
        memory: "1Gi"
        cpu: "500m"
      limits:
        memory: "2Gi"
        cpu: "2000m"
    modelProvider: "openai"
    modelName: "gpt-4"

  audience:
    replicas: 2
    resources:
      requests:
        memory: "2Gi"
        cpu: "1000m"
      limits:
        memory: "4Gi"
        cpu: "2000m"

  ml:
    replicas: 2
    resources:
      requests:
        memory: "2Gi"
        cpu: "1000m"
      limits:
        memory: "4Gi"
        cpu: "2000m"

data:
  postgres:
    replicas: 1
    resources:
      requests:
        memory: "4Gi"
        cpu: "2000m"
      limits:
        memory: "8Gi"
        cpu: "4000m"
    storage: 100Gi
    storageClass: "fast-ssd"
    maxConnections: 200
    sharedBuffers: "2GB"
    effectiveCacheSize: "6GB"

  clickhouse:
    replicas: 1
    resources:
      requests:
        memory: "8Gi"
        cpu: "4000m"
      limits:
        memory: "16Gi"
        cpu: "8000m"
    storage: 500Gi
    storageClass: "fast-ssd"
    maxMemoryUsage: "10Gi"

  kafka:
    replicas: 3
    resources:
      requests:
        memory: "2Gi"
        cpu: "1000m"
      limits:
        memory: "4Gi"
        cpu: "2000m"
    storage: 200Gi
    storageClass: "standard"
    retentionMs: 604800000

  valkey:
    replicas: 1
    resources:
      requests:
        memory: "2Gi"
        cpu: "500m"
      limits:
        memory: "4Gi"
        cpu: "1000m"
    maxmemory: "3gb"
    maxmemoryPolicy: "allkeys-lru"

  minio:
    replicas: 4
    resources:
      requests:
        memory: "1Gi"
        cpu: "500m"
      limits:
        memory: "2Gi"
        cpu: "1000m"
    storage: 1Ti
    storageClass: "fast-ssd"

observability:
  prometheus:
    replicas: 1   # each Prometheus instance needs its own volume; see the observability section
    storage: 100Gi
    storageClass: "fast-ssd"
    retention: "15d"
    resources:
      requests:
        memory: "2Gi"
        cpu: "500m"
      limits:
        memory: "4Gi"
        cpu: "2000m"

  grafana:
    replicas: 1   # Grafana HA requires shared storage or an external database
    adminPassword: "" # Set via secrets
    storage: 10Gi
    storageClass: "standard"
    resources:
      requests:
        memory: "256Mi"
        cpu: "100m"
      limits:
        memory: "512Mi"
        cpu: "500m"

vault:
  enabled: true
  replicas: 3
  storage: 10Gi
  storageClass: "fast-ssd"
  resources:
    requests:
      memory: "512Mi"
      cpu: "250m"
    limits:
      memory: "1Gi"
      cpu: "1000m"

ingress:
  enabled: true
  className: "nginx"
  tls:
    enabled: true
    issuer: "letsencrypt-prod"
  hosts:
    - host: "placino.local"
      paths:
        - path: /
          pathType: Prefix

affinity:
  podAntiAffinity: preferred

tolerations: []
nodeSelector: {}

Configuring Persistent Storage

Stateful components require careful storage configuration. PostgreSQL, ClickHouse, and MinIO use different I/O patterns and should be provisioned accordingly.

Storage Classes

# Fast SSD for databases
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com
parameters:
  type: io1
  iops: "1000"
  fstype: ext4
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
---
# Standard storage for Kafka, observability
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  fstype: ext4
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer

PostgreSQL Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
  namespace: placino-data
spec:
  replicas: 1
  strategy:
    type: Recreate   # never start a second postgres against the same volume
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:15-alpine
        # Tuning flags go to the server process; POSTGRES_INITDB_ARGS cannot
        # carry -c server settings on PostgreSQL 15 (initdb --set arrived in 16).
        args: ["-c", "max_connections=200", "-c", "shared_buffers=2GB",
               "-c", "effective_cache_size=6GB", "-c", "maintenance_work_mem=512MB",
               "-c", "checkpoint_completion_target=0.9", "-c", "wal_buffers=16MB",
               "-c", "default_statistics_target=100", "-c", "random_page_cost=1.1",
               "-c", "effective_io_concurrency=200", "-c", "work_mem=10485kB",
               "-c", "min_wal_size=4GB", "-c", "max_wal_size=16GB"]
        ports:
        - containerPort: 5432
        env:
        - name: POSTGRES_DB
          value: placino
        - name: POSTGRES_USER
          valueFrom:
            secretKeyRef:
              name: postgres-credentials
              key: username
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres-credentials
              key: password
        - name: PGDATA
          value: /var/lib/postgresql/data/pgdata   # subdirectory sidesteps lost+found on fresh volumes
        resources:
          requests:
            memory: "4Gi"
            cpu: "2000m"
          limits:
            memory: "8Gi"
            cpu: "4000m"
        volumeMounts:
        - name: postgres-storage
          mountPath: /var/lib/postgresql/data
        livenessProbe:
          exec:
            command:
            - /bin/sh
            - -c
            - pg_isready -U $POSTGRES_USER
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          exec:
            command:
            - /bin/sh
            - -c
            - pg_isready -U $POSTGRES_USER
          initialDelaySeconds: 5
          periodSeconds: 10
      volumes:
      - name: postgres-storage
        persistentVolumeClaim:
          claimName: postgres-pvc
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pvc
  namespace: placino-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-ssd
  resources:
    requests:
      storage: 100Gi
---
apiVersion: v1
kind: Service
metadata:
  name: postgres
  namespace: placino-data
spec:
  type: ClusterIP
  ports:
  - port: 5432
    targetPort: 5432
  selector:
    app: postgres

ClickHouse Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: clickhouse
  namespace: placino-data
spec:
  replicas: 1
  strategy:
    type: Recreate   # single writer per ReadWriteOnce volume
  selector:
    matchLabels:
      app: clickhouse
  template:
    metadata:
      labels:
        app: clickhouse
    spec:
      containers:
      - name: clickhouse
        image: clickhouse/clickhouse-server:latest   # pin a specific release in production
        ports:
        - name: http
          containerPort: 8123
        - name: tcp
          containerPort: 9000
        env:
        - name: CLICKHOUSE_DB
          value: placino
        resources:
          requests:
            memory: "8Gi"
            cpu: "4000m"
          limits:
            memory: "16Gi"
            cpu: "8000m"
        volumeMounts:
        - name: clickhouse-storage
          mountPath: /var/lib/clickhouse
        # Overrides go into config.d/users.d; mounting over the whole
        # /etc/clickhouse-server directory would hide the default config.
        - name: clickhouse-config
          mountPath: /etc/clickhouse-server/config.d/override.xml
          subPath: override.xml
        - name: clickhouse-config
          mountPath: /etc/clickhouse-server/users.d/memory.xml
          subPath: memory.xml
        livenessProbe:
          httpGet:
            path: /ping
            port: 8123
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /ping
            port: 8123
          initialDelaySeconds: 5
          periodSeconds: 10
      volumes:
      - name: clickhouse-storage
        persistentVolumeClaim:
          claimName: clickhouse-pvc
      - name: clickhouse-config
        configMap:
          name: clickhouse-config
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: clickhouse-config
  namespace: placino-data
data:
  override.xml: |
    <clickhouse>
      <logger>
        <level>information</level>
      </logger>
      <http_port>8123</http_port>
      <tcp_port>9000</tcp_port>
    </clickhouse>
  memory.xml: |
    <!-- max_memory_usage is a per-user profile setting, not server config -->
    <clickhouse>
      <profiles>
        <default>
          <max_memory_usage>10737418240</max_memory_usage>
          <max_memory_usage_for_user>10737418240</max_memory_usage_for_user>
        </default>
      </profiles>
    </clickhouse>
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: clickhouse-pvc
  namespace: placino-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-ssd
  resources:
    requests:
      storage: 500Gi
---
apiVersion: v1
kind: Service
metadata:
  name: clickhouse
  namespace: placino-data
spec:
  type: ClusterIP
  ports:
  - name: http
    port: 8123
    targetPort: 8123
  - name: tcp
    port: 9000
    targetPort: 9000
  selector:
    app: clickhouse

MinIO Multi-Node Deployment

Distributed MinIO needs stable pod identities (minio-0 through minio-3) and one volume per pod, so it runs as a StatefulSet behind the headless service below rather than as a Deployment sharing a single PVC.

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: minio
  namespace: placino-data
spec:
  serviceName: minio   # headless service provides per-pod DNS
  replicas: 4
  selector:
    matchLabels:
      app: minio
  template:
    metadata:
      labels:
        app: minio
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - minio
            topologyKey: kubernetes.io/hostname
      containers:
      - name: minio
        image: minio/minio:latest
        command:
        - minio
        - server
        - http://minio-0.minio.placino-data.svc.cluster.local/data
        - http://minio-1.minio.placino-data.svc.cluster.local/data
        - http://minio-2.minio.placino-data.svc.cluster.local/data
        - http://minio-3.minio.placino-data.svc.cluster.local/data
        - --console-address
        - ":9001"
        ports:
        - containerPort: 9000
          name: minio
        - containerPort: 9001
          name: console
        env:
        - name: MINIO_ROOT_USER
          valueFrom:
            secretKeyRef:
              name: minio-credentials
              key: username
        - name: MINIO_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: minio-credentials
              key: password
        resources:
          requests:
            memory: "1Gi"
            cpu: "500m"
          limits:
            memory: "2Gi"
            cpu: "1000m"
        volumeMounts:
        - name: minio-storage
          mountPath: /data
        livenessProbe:
          httpGet:
            path: /minio/health/live
            port: 9000
          initialDelaySeconds: 30
          periodSeconds: 20
        readinessProbe:
          httpGet:
            path: /minio/health/ready
            port: 9000
          initialDelaySeconds: 5
          periodSeconds: 10
  volumeClaimTemplates:
  - metadata:
      name: minio-storage
    spec:
      accessModes:
      - ReadWriteOnce
      storageClassName: fast-ssd
      resources:
        requests:
          storage: 1Ti
---
apiVersion: v1
kind: Service
metadata:
  name: minio
  namespace: placino-data
spec:
  clusterIP: None
  ports:
  - port: 9000
    targetPort: 9000
  - port: 9001
    targetPort: 9001
  selector:
    app: minio

Secrets Management with Vault

Kubernetes Secrets are base64-encoded, not encrypted at rest by default. HashiCorp Vault provides dynamic secret generation, rotation, and audit logging. Placino integrates Vault for database credentials, API keys, TLS certificates, and encryption keys.

Vault Configuration

# Vault Auth Method: Kubernetes
vault auth enable kubernetes

vault write auth/kubernetes/config \
  token_reviewer_jwt=@/var/run/secrets/kubernetes.io/serviceaccount/token \
  kubernetes_host=https://$KUBERNETES_SERVICE_HOST:$KUBERNETES_SERVICE_PORT \
  kubernetes_ca_cert=@/var/run/secrets/kubernetes.io/serviceaccount/ca.crt

# Database Secret Engine
vault secrets enable database

vault write database/config/postgres \
  plugin_name=postgresql-database-plugin \
  allowed_roles="placino-role" \
  connection_url="postgresql://{{username}}:{{password}}@postgres.placino-data:5432/placino" \
  username="vault_admin" \
  password="$(tr -dc 'a-zA-Z0-9' < /dev/urandom | fold -w 32 | head -n 1)"

vault write database/roles/placino-role \
  db_name=postgres \
  creation_statements="CREATE USER \"{{name}}\" WITH PASSWORD '{{password}}' VALID UNTIL '{{expiration}}'; GRANT CONNECT ON DATABASE placino TO \"{{name}}\"; GRANT USAGE ON SCHEMA public TO \"{{name}}\"; GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO \"{{name}}\";" \
  default_ttl="1h" \
  max_ttl="24h"

# KV Secret Engine
vault secrets enable -version=2 kv
vault kv put kv/placino/minio \
  access_key="$(openssl rand -base64 32)" \
  secret_key="$(openssl rand -base64 32)"

vault kv put kv/placino/tls \
  cert=@/path/to/cert.pem \
  key=@/path/to/key.pem

# Transit engine backing the encrypt/decrypt paths granted below
vault secrets enable transit
vault write -f transit/keys/placino

# Policies
vault policy write placino-policy - <<EOF
path "database/creds/placino-role" {
  capabilities = ["read"]
}
path "kv/data/placino/*" {
  capabilities = ["read", "list"]
}
path "transit/encrypt/placino" {
  capabilities = ["update"]
}
path "transit/decrypt/placino" {
  capabilities = ["update"]
}
EOF

# Kubernetes Auth Role
vault write auth/kubernetes/role/placino \
  bound_service_account_names=placino-backend,placino-data \
  bound_service_account_namespaces=placino-backend,placino-data \
  policies=placino-policy \
  ttl=24h

Vault Agent Injector

apiVersion: apps/v1
kind: Deployment
metadata:
  name: auth-service
  namespace: placino-backend
spec:
  replicas: 2
  selector:
    matchLabels:
      app: auth
  template:
    metadata:
      labels:
        app: auth
      annotations:
        vault.hashicorp.com/agent-inject: "true"
        vault.hashicorp.com/role: "placino"
        vault.hashicorp.com/agent-inject-secret-database: "database/creds/placino-role"
        vault.hashicorp.com/agent-inject-template-database: |
          {{- with secret "database/creds/placino-role" -}}
          export DB_USER="{{ .Data.data.username }}"
          export DB_PASSWORD="{{ .Data.data.password }}"
          {{- end }}
        vault.hashicorp.com/agent-inject-secret-minio: "kv/data/placino/minio"
        vault.hashicorp.com/agent-inject-template-minio: |
          {{- with secret "kv/data/placino/minio" -}}
          export MINIO_ACCESS_KEY="{{ .Data.data.access_key }}"
          export MINIO_SECRET_KEY="{{ .Data.data.secret_key }}"
          {{- end }}
    spec:
      serviceAccountName: placino-backend
      containers:
      - name: auth
        image: placino/auth:1.4.0
        ports:
        - containerPort: 8060
        env:
        - name: VAULT_ADDR
          value: "http://vault.placino-data:8200"
        # Kubernetes only expands $(VAR) for variables defined in the pod
        # spec, so the connection string cannot interpolate the injected
        # DB_USER/DB_PASSWORD; the entrypoint sources /vault/secrets/database
        # before the service connects.
        resources:
          requests:
            memory: "512Mi"
            cpu: "250m"
          limits:
            memory: "1Gi"
            cpu: "1000m"

Kong API Gateway Configuration

Kong acts as the single entry point for all external API traffic. It handles authentication (OAuth2, API keys), rate limiting, request/response transformation, and routing to backend services. With a Postgres-backed installation, run kong migrations bootstrap once (for example from a one-shot Job or init container) before starting the proxy pods.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: kong
  namespace: placino-frontend
spec:
  replicas: 3
  selector:
    matchLabels:
      app: kong
  template:
    metadata:
      labels:
        app: kong
    spec:
      containers:
      - name: kong
        image: kong:3.4-alpine
        ports:
        - name: proxy
          containerPort: 8000
        - name: proxy-tls
          containerPort: 8443   # Kong's default proxy_listen includes 8443 ssl
        - name: admin
          containerPort: 8001
        env:
        - name: KONG_DATABASE
          value: "postgres"
        - name: KONG_PG_HOST
          value: "postgres.placino-data"
        - name: KONG_PG_PORT
          value: "5432"
        - name: KONG_PG_USER
          valueFrom:
            secretKeyRef:
              name: kong-db-creds
              key: username
        - name: KONG_PG_PASSWORD
          valueFrom:
            secretKeyRef:
              name: kong-db-creds
              key: password
        - name: KONG_PG_DATABASE
          value: "kong"
        - name: KONG_PROXY_ACCESS_LOG
          value: "/dev/stdout"
        - name: KONG_PROXY_ERROR_LOG
          value: "/dev/stderr"
        - name: KONG_ADMIN_ACCESS_LOG
          value: "/dev/stdout"
        - name: KONG_ADMIN_ERROR_LOG
          value: "/dev/stderr"
        - name: KONG_ADMIN_LISTEN
          value: "0.0.0.0:8001"
        - name: KONG_DNS_RESOLVER
          value: "10.96.0.10"   # cluster DNS ClusterIP; Kong expects IP addresses here, not names
        resources:
          requests:
            memory: "512Mi"
            cpu: "250m"
          limits:
            memory: "1Gi"
            cpu: "1000m"
        livenessProbe:
          httpGet:
            path: /status
            port: 8001
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /status
            port: 8001
          initialDelaySeconds: 5
          periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: kong-proxy
  namespace: placino-frontend
spec:
  type: LoadBalancer
  ports:
  - name: proxy
    port: 80
    targetPort: 8000
    protocol: TCP
  - name: proxy-tls
    port: 443
    targetPort: 8443
    protocol: TCP
  selector:
    app: kong
---
apiVersion: v1
kind: Service
metadata:
  name: kong-admin
  namespace: placino-frontend
spec:
  type: ClusterIP
  ports:
  - port: 8001
    targetPort: 8001
  selector:
    app: kong
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: kong-routes
  namespace: placino-frontend
data:
  routes.sh: |
    #!/bin/bash
    KONG_ADMIN_URL="http://kong-admin:8001"

    # Upstream services
    curl -X POST "$KONG_ADMIN_URL/upstreams" \
      --data name=auth-upstream
    curl -X POST "$KONG_ADMIN_URL/upstreams/auth-upstream/targets" \
      --data "target=auth-service.placino-backend:8060"

    curl -X POST "$KONG_ADMIN_URL/upstreams" \
      --data name=ingestion-upstream
    curl -X POST "$KONG_ADMIN_URL/upstreams/ingestion-upstream/targets" \
      --data "target=ingestion-service.placino-backend:8010"

    curl -X POST "$KONG_ADMIN_URL/upstreams" \
      --data name=matching-upstream
    curl -X POST "$KONG_ADMIN_URL/upstreams/matching-upstream/targets" \
      --data "target=matching-service.placino-backend:8020"

    curl -X POST "$KONG_ADMIN_URL/upstreams" \
      --data name=query-upstream
    curl -X POST "$KONG_ADMIN_URL/upstreams/query-upstream/targets" \
      --data "target=query-service.placino-backend:8030"

    curl -X POST "$KONG_ADMIN_URL/upstreams" \
      --data name=governance-upstream
    curl -X POST "$KONG_ADMIN_URL/upstreams/governance-upstream/targets" \
      --data "target=governance-service.placino-backend:8040"

    curl -X POST "$KONG_ADMIN_URL/upstreams" \
      --data name=audit-upstream
    curl -X POST "$KONG_ADMIN_URL/upstreams/audit-upstream/targets" \
      --data "target=audit-service.placino-backend:8050"

    curl -X POST "$KONG_ADMIN_URL/upstreams" \
      --data name=catalog-upstream
    curl -X POST "$KONG_ADMIN_URL/upstreams/catalog-upstream/targets" \
      --data "target=catalog-service.placino-backend:8070"

    # Services (repeat for matching, query, governance, audit, catalog)
    curl -X POST "$KONG_ADMIN_URL/services" \
      --data "name=auth" \
      --data "host=auth-upstream" \
      --data "port=8060" \
      --data "protocol=http"

    curl -X POST "$KONG_ADMIN_URL/services" \
      --data "name=ingestion" \
      --data "host=ingestion-upstream" \
      --data "port=8010" \
      --data "protocol=http"

    # Routes (same pattern for the remaining services)
    curl -X POST "$KONG_ADMIN_URL/services/auth/routes" \
      --data "paths[]=/auth" \
      --data "methods[]=POST" \
      --data "methods[]=GET"

    curl -X POST "$KONG_ADMIN_URL/services/ingestion/routes" \
      --data "paths[]=/ingest" \
      --data "methods[]=POST"

    # Plugins
    # /proc/sys/kernel/random/uuid avoids depending on a uuidgen binary
    curl -X POST "$KONG_ADMIN_URL/plugins" \
      --data "name=oauth2" \
      --data "config.enable_authorization_code=true" \
      --data "config.provision_key=$(cat /proc/sys/kernel/random/uuid)"

    curl -X POST "$KONG_ADMIN_URL/plugins" \
      --data "name=rate-limiting" \
      --data "config.minute=1000" \
      --data "config.hour=10000"

    # correlation-id stamps a fresh X-Request-ID per request; a static
    # request-transformer header would bake in a single value at config time.
    curl -X POST "$KONG_ADMIN_URL/plugins" \
      --data "name=correlation-id" \
      --data "config.header_name=X-Request-ID" \
      --data "config.generator=uuid"

    curl -X POST "$KONG_ADMIN_URL/plugins" \
      --data "name=cors" \
      --data "config.origins[]=*" \
      --data "config.methods[]=GET" \
      --data "config.methods[]=POST" \
      --data "config.methods[]=PUT" \
      --data "config.methods[]=DELETE"
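
A ConfigMap only stores the script; nothing executes it. One option, sketched below under assumed names (curlimages/curl is an assumption, any small image with sh and curl works), is a one-shot Job that mounts the kong-routes ConfigMap and retries until Kong's admin API responds:

apiVersion: batch/v1
kind: Job
metadata:
  name: kong-configure-routes
  namespace: placino-frontend
spec:
  backoffLimit: 6   # retry while Kong is still starting
  template:
    spec:
      restartPolicy: OnFailure
      containers:
      - name: configure
        image: curlimages/curl:8.5.0
        command: ["/bin/sh", "/scripts/routes.sh"]
        volumeMounts:
        - name: routes
          mountPath: /scripts
      volumes:
      - name: routes
        configMap:
          name: kong-routes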

OPA Sidecar Deployment

Open Policy Agent (OPA) enforces fine-grained access control policies at the application layer. Each backend service runs an OPA sidecar that evaluates incoming requests against policy rules before processing. The policies below use the Rego v1 syntax that current OPA releases (1.0+) expect.

apiVersion: v1
kind: ConfigMap
metadata:
  name: opa-policies
  namespace: placino-backend
data:
  data.rego: |
    package placino.data

    import rego.v1

    import data.placino.roles
    import data.placino.tokens

    # Decision function
    default allow := false

    allow if {
      input.method == "GET"
      tokens.valid[input.token]
      roles.can_read[input.user_role]
    }

    allow if {
      input.method == "POST"
      tokens.valid[input.token]
      roles.can_write[input.user_role]
    }

    allow if {
      input.method == "DELETE"
      tokens.valid[input.token]
      roles.is_admin[input.user_role]
    }

  tokens.rego: |
    package placino.tokens

    import rego.v1

    valid contains token if {
      token := input.token
      token != ""
      not revoked[token]
    }

    # Static list for illustration; production deployments sync
    # revocations from the auth service.
    revoked contains token if {
      some token in ["token123", "token456"]
    }

  roles.rego: |
    package placino.roles

    import rego.v1

    can_read contains role if {
      some role in ["viewer", "editor", "admin"]
    }

    can_write contains role if {
      some role in ["editor", "admin"]
    }

    is_admin contains role if {
      role := "admin"
    }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: query-service
  namespace: placino-backend
spec:
  replicas: 3
  selector:
    matchLabels:
      app: query
  template:
    metadata:
      labels:
        app: query
    spec:
      containers:
      - name: query
        image: placino/query:1.4.0
        ports:
        - containerPort: 8030
        env:
        - name: OPA_ADDR
          value: "http://localhost:8181"
        - name: LOG_LEVEL
          value: "info"
        resources:
          requests:
            memory: "1Gi"
            cpu: "500m"
          limits:
            memory: "3Gi"
            cpu: "2000m"
      - name: opa
        image: openpolicyagent/opa:latest
        args:
        - "run"
        - "--server"
        - "--addr=0.0.0.0:8181"   # pod-wide bind so kubelet HTTP probes can reach /health
        - "--log-level=info"
        - "/policies"
        ports:
        - containerPort: 8181
        resources:
          requests:
            memory: "256Mi"
            cpu: "100m"
          limits:
            memory: "512Mi"
            cpu: "500m"
        volumeMounts:
        - name: opa-policies
          mountPath: /policies
        livenessProbe:
          httpGet:
            path: /health
            port: 8181
          initialDelaySeconds: 5
          periodSeconds: 5
      volumes:
      - name: opa-policies
        configMap:
          name: opa-policies

Horizontal Scaling Patterns

Stateless services (auth, ingestion, matching, query) scale horizontally via Kubernetes Deployments and Horizontal Pod Autoscalers. Kafka provides queueing for decoupled scaling. Stateful services (PostgreSQL, ClickHouse) require different patterns.

Horizontal Pod Autoscaling

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: query-hpa
  namespace: placino-backend
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: query-service
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 50
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
      - type: Percent
        value: 100
        periodSeconds: 30
      - type: Pods
        value: 2
        periodSeconds: 30
      selectPolicy: Max
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ingestion-hpa
  namespace: placino-backend
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ingestion-service
  minReplicas: 3
  maxReplicas: 15
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 75
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 85
  - type: Pods
    pods:
      metric:
        name: kafka_consumer_lag_seconds
      target:
        type: AverageValue
        averageValue: "60"
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: matching-hpa
  namespace: placino-backend
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: matching-service
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 85
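
The kafka_consumer_lag_seconds metric in the ingestion HPA is not a built-in resource metric: the custom metrics API has to serve it, typically via the prometheus-adapter. A minimal adapter rule, assuming the adapter is installed and a lag exporter already publishes kafka_consumer_lag_seconds to Prometheus with namespace and pod labels:

# prometheus-adapter Helm values fragment (assumed installation)
rules:
  custom:
  - seriesQuery: 'kafka_consumer_lag_seconds{namespace!="",pod!=""}'
    resources:
      overrides:
        namespace: {resource: "namespace"}
        pod: {resource: "pod"}
    name:
      matches: "^kafka_consumer_lag_seconds$"
    metricsQuery: 'avg(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'

The CPU and memory targets, by contrast, need only the standard metrics-server.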

Pod Disruption Budgets

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: query-pdb
  namespace: placino-backend
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: query
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: ingestion-pdb
  namespace: placino-backend
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: ingestion
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: postgres-pdb
  namespace: placino-data
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: postgres
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: kafka-pdb
  namespace: placino-data
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: kafka

Observability Stack (Prometheus, Grafana)

Production Kubernetes clusters require metric collection, visualization, and alerting. Prometheus scrapes metrics from the Kubernetes API servers, the kubelet, and application endpoints; Grafana provides dashboards; Alertmanager (referenced in the configuration below, deployment omitted for brevity) routes alerts based on severity.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
  namespace: placino-observability
spec:
  replicas: 1   # two replicas cannot share one ReadWriteOnce volume; run independent instances or Thanos for HA
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      serviceAccountName: prometheus
      containers:
      - name: prometheus
        image: prom/prometheus:latest
        args:
        - "--config.file=/etc/prometheus/prometheus.yml"
        - "--storage.tsdb.path=/prometheus"
        - "--storage.tsdb.retention.time=15d"
        - "--web.console.libraries=/usr/share/prometheus/console_libraries"
        - "--web.console.templates=/usr/share/prometheus/consoles"
        ports:
        - containerPort: 9090
        resources:
          requests:
            memory: "2Gi"
            cpu: "500m"
          limits:
            memory: "4Gi"
            cpu: "2000m"
        volumeMounts:
        - name: prometheus-storage
          mountPath: /prometheus
        - name: prometheus-config
          mountPath: /etc/prometheus
        - name: prometheus-rules
          mountPath: /etc/prometheus/rules
      volumes:
      - name: prometheus-storage
        persistentVolumeClaim:
          claimName: prometheus-pvc
      - name: prometheus-config
        configMap:
          name: prometheus-config
      - name: prometheus-rules
        configMap:
          name: prometheus-rules
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: placino-observability
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      evaluation_interval: 15s
      external_labels:
        cluster: "placino-production"
        environment: "prod"

    alerting:
      alertmanagers:
      - static_configs:
        - targets:
          - alertmanager:9093

    rule_files:
    - /etc/prometheus/rules/*.yml

    scrape_configs:
    - job_name: 'prometheus'
      static_configs:
      - targets: ['localhost:9090']

    - job_name: 'kubernetes-apiservers'
      kubernetes_sd_configs:
      - role: endpoints
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
        action: keep
        regex: default;kubernetes;https

    - job_name: 'kubernetes-nodes'
      kubernetes_sd_configs:
      - role: node
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

    - job_name: 'placino-services'
      kubernetes_sd_configs:
      - role: pod
        namespaces:
          names:
          - placino-backend
          - placino-frontend
          - placino-data
      relabel_configs:
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-rules
  namespace: placino-observability
data:
  placino-alerts.yml: |
    groups:
    - name: placino-backend
      interval: 30s
      rules:
      - alert: HighQueryLatency
        expr: histogram_quantile(0.95, query_latency_seconds_bucket) > 5
        for: 5m
        annotations:
          summary: "High query latency detected"
          description: "Query 95th percentile latency is {{ $value }}s"

      - alert: HighErrorRate
        expr: rate(query_errors_total[5m]) > 0.01
        for: 5m
        annotations:
          summary: "High error rate"
          description: "Error rate is {{ $value }}"

      - alert: PodCrashLooping
        expr: rate(kube_pod_container_status_restarts_total[15m]) > 0.1
        for: 5m
        annotations:
          summary: "Pod is crash looping"
          description: "Pod {{ $labels.pod_name }} in namespace {{ $labels.namespace }}"

      - alert: PostgreSQLDown
        expr: pg_up == 0
        for: 1m
        annotations:
          summary: "PostgreSQL is down"
          description: "PostgreSQL has been unavailable for 1 minute"

      - alert: HighDiskUsage
        expr: (kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes) > 0.85
        for: 5m
        annotations:
          summary: "High disk usage"
          description: "Volume {{ $labels.persistentvolumeclaim }} is {{ $value | humanizePercentage }} full"

      - alert: MemoryPressure
        expr: container_memory_usage_bytes / container_spec_memory_limit_bytes > 0.9
        for: 5m
        annotations:
          summary: "Container memory pressure"
          description: "Container {{ $labels.pod_name }} memory usage is {{ $value | humanizePercentage }}"
---
apiVersion: v1
kind: Service
metadata:
  name: prometheus
  namespace: placino-observability
spec:
  type: ClusterIP
  ports:
  - port: 9090
    targetPort: 9090
  selector:
    app: prometheus
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana
  namespace: placino-observability
spec:
  replicas: 1   # scaling Grafana beyond one replica needs shared storage or an external database
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      containers:
      - name: grafana
        image: grafana/grafana:latest
        ports:
        - containerPort: 3000
        env:
        - name: GF_SECURITY_ADMIN_PASSWORD
          valueFrom:
            secretKeyRef:
              name: grafana-admin
              key: password
        - name: GF_INSTALL_PLUGINS
          value: "grafana-piechart-panel,grafana-worldmap-panel"
        resources:
          requests:
            memory: "256Mi"
            cpu: "100m"
          limits:
            memory: "512Mi"
            cpu: "500m"
        volumeMounts:
        - name: grafana-storage
          mountPath: /var/lib/grafana
        - name: grafana-datasources
          mountPath: /etc/grafana/provisioning/datasources
      volumes:
      - name: grafana-storage
        persistentVolumeClaim:
          claimName: grafana-pvc
      - name: grafana-datasources
        configMap:
          name: grafana-datasources
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-datasources
  namespace: placino-observability
data:
  prometheus.yaml: |
    apiVersion: 1
    datasources:
    - name: Prometheus
      type: prometheus
      access: proxy
      url: http://prometheus:9090
      isDefault: true
      editable: true
---
apiVersion: v1
kind: Service
metadata:
  name: grafana
  namespace: placino-observability
spec:
  type: ClusterIP
  ports:
  - port: 3000
    targetPort: 3000
  selector:
    app: grafana
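
The placino-services scrape job keeps only pods that opt in through annotations, so each service's pod template must carry them. For the query service, assuming its Go process serves /metrics on the service port, the template metadata looks like:

  template:
    metadata:
      labels:
        app: query
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/path: "/metrics"
        prometheus.io/port: "8030"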

Production Hardening Checklist

Moving Placino to production requires systematic hardening beyond basic deployment. This checklist covers resource limits, security policies, backup strategies, and compliance controls.

Resource Management

  • Set resource requests and limits on all Deployments, StatefulSets, DaemonSets
  • Enable ResourceQuota per namespace to prevent runaway pods (see the sketch after this list)
  • Configure LimitRange to set defaults for new pods
  • Monitor node pressure (memory, disk, PID) via metrics
  • Set up node auto-scaling with cluster autoscaler or Karpenter
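
A starting point for the quota and defaults mentioned above, with illustrative sizes for the backend namespace:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: backend-quota
  namespace: placino-backend
spec:
  hard:
    requests.cpu: "24"
    requests.memory: 48Gi
    limits.cpu: "48"
    limits.memory: 96Gi
    pods: "100"
---
apiVersion: v1
kind: LimitRange
metadata:
  name: backend-defaults
  namespace: placino-backend
spec:
  limits:
  - type: Container
    default:            # applied when a container declares no limits
      cpu: 500m
      memory: 512Mi
    defaultRequest:     # applied when a container declares no requests
      cpu: 100m
      memory: 256Mi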

Security Policies

  • Enforce Pod Security Standards (restricted profile) at namespace level (example after this list)
  • Run containers as non-root user (securityContext.runAsNonRoot=true)
  • Drop Linux capabilities (drop: ["ALL"]) and only add required ones
  • Mount filesystems as read-only (securityContext.readOnlyRootFilesystem=true)
  • Enable SELinux or AppArmor on worker nodes
  • Use NetworkPolicies to restrict pod-to-pod communication
  • Enable RBAC and audit logging on API server
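
The first four items map directly onto namespace labels and a container securityContext; a sketch (the runAsUser value is arbitrary):

apiVersion: v1
kind: Namespace
metadata:
  name: placino-backend
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
---
# Per-container securityContext fragment
securityContext:
  runAsNonRoot: true
  runAsUser: 10001
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
  capabilities:
    drop: ["ALL"]
  seccompProfile:
    type: RuntimeDefault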

Data Protection

  • Enable encryption at rest for etcd (API server --encryption-provider-config; example after this list)
  • Encrypt Kubernetes Secrets with Vault (not base64)
  • Use TLS 1.3 for all API server and kubelet communication
  • Implement database-level encryption (PostgreSQL pgcrypto, ClickHouse)
  • Configure daily snapshots of PostgreSQL and ClickHouse to object storage
  • Test backup restoration monthly
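
The etcd item points the API server's --encryption-provider-config flag at a file like the following; the key is a placeholder to generate with head -c 32 /dev/urandom | base64:

apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
- resources:
  - secrets
  providers:
  - aescbc:
      keys:
      - name: key1
        secret: <base64-encoded-32-byte-key>   # placeholder; never commit a real key
  - identity: {}   # fallback so pre-existing unencrypted data stays readable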

High Availability

  • Run control plane with 3+ etcd nodes
  • Distribute worker nodes across 3+ availability zones
  • Use PodDisruptionBudgets for critical services (minAvailable: 2)
  • Implement health checks (liveness, readiness, startup probes)
  • Configure graceful termination (terminationGracePeriodSeconds: 60; fragment after this list)
  • Monitor MTTR and MTTF metrics
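
A graceful-termination fragment for a backend pod spec, assuming the image ships a sleep binary and the service drains in-flight work on SIGTERM:

terminationGracePeriodSeconds: 60
containers:
- name: query
  lifecycle:
    preStop:
      exec:
        command: ["sleep", "10"]   # let endpoints deprogram before SIGTERM arrives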

Compliance and Audit

  • Enable API server audit logging (--audit-log-maxage=30; policy example after this list)
  • Log all Vault operations for compliance
  • Implement pod-level network monitoring (Cilium/Falco for runtime threats)
  • Regular security scans of container images (Trivy, Snyk)
  • Document all manual kubectl commands (prefer GitOps)
  • Track ConfigMap and Secret changes via version control
  • Schedule quarterly penetration testing
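
Audit logging also needs a policy file passed via --audit-policy-file. A minimal example that records full request bodies for Secret and ConfigMap changes and metadata for everything else:

apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: RequestResponse
  verbs: ["create", "update", "patch", "delete"]
  resources:
  - group: ""
    resources: ["secrets", "configmaps"]
- level: Metadata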

Operational Readiness

  • Implement centralized logging (ELK, Loki, Splunk) with 90-day retention
  • Configure alerting for SLOs (error rate, latency, availability)
  • Create runbooks for common incidents (pod crash, database failover, OOM)
  • Establish on-call rotation and escalation procedures
  • Document cluster maintenance windows
  • Test disaster recovery annually (full cluster rebuild)

Air-Gapped Deployment Considerations

Organizations with strict network isolation (air-gapped environments) cannot pull container images or dependencies from public registries. Placino must be deployed with pre-downloaded images, offline package repositories, and internal registries.

Image Registry Setup

# On a machine with internet access:
# 1. Download all Placino images
docker pull placino/auth:1.4.0
docker pull placino/ingestion:1.4.0
docker pull placino/matching:1.4.0
docker pull placino/query:1.4.0
docker pull placino/governance:1.4.0
docker pull placino/audit:1.4.0
docker pull placino/catalog:1.4.0
docker pull placino/ai-proxy:1.4.0
docker pull placino/audience:1.4.0
docker pull placino/ml:1.4.0
docker pull placino/react-frontend:1.4.0

# 2. Download dependencies
docker pull postgres:15-alpine
docker pull clickhouse/clickhouse-server:latest
docker pull confluentinc/cp-kafka:7.5.0
docker pull valkey/valkey:latest
docker pull minio/minio:latest
docker pull kong:3.4-alpine
docker pull openpolicyagent/opa:latest
docker pull prom/prometheus:latest
docker pull grafana/grafana:latest
docker pull hashicorp/vault:latest

# 3. Export as tar archives
docker save placino/auth:1.4.0 | gzip > placino-auth-1.4.0.tar.gz
docker save placino/ingestion:1.4.0 | gzip > placino-ingestion-1.4.0.tar.gz
# ... repeat for all images

# 4. Transfer to air-gapped network via approved media
# 5. On air-gapped network, load images:
docker load < placino-auth-1.4.0.tar.gz
docker load < placino-ingestion-1.4.0.tar.gz
# ... repeat for all images

# 6. Push to internal registry
docker tag placino/auth:1.4.0 registry.internal/placino/auth:1.4.0
docker push registry.internal/placino/auth:1.4.0
# ... repeat for all images

Helm Chart Air-Gap Configuration

# values-airgap.yaml
image:
  registry: registry.internal
  repository: placino
  tag: "1.4.0"
  pullPolicy: IfNotPresent

imagePullSecrets:
- name: registry-credentials

frontend:
  react:
    image:
      registry: registry.internal
      repository: placino/react-frontend
      tag: "1.4.0"

backend:
  auth:
    image:
      registry: registry.internal
      repository: placino/auth
      tag: "1.4.0"

  ingestion:
    image:
      registry: registry.internal
      repository: placino/ingestion
      tag: "1.4.0"

data:
  postgres:
    image:
      registry: registry.internal
      repository: postgres
      tag: "15-alpine"

  clickhouse:
    image:
      registry: registry.internal
      repository: clickhouse/clickhouse-server
      tag: "latest"

  kafka:
    image:
      registry: registry.internal
      repository: confluentinc/cp-kafka
      tag: "7.5.0"

  valkey:
    image:
      registry: registry.internal
      repository: valkey/valkey
      tag: "latest"

  minio:
    image:
      registry: registry.internal
      repository: minio/minio
      tag: "latest"

kong:
  image:
    registry: registry.internal
    repository: kong
    tag: "3.4-alpine"

opa:
  image:
    registry: registry.internal
    repository: openpolicyagent/opa
    tag: "latest"

observability:
  prometheus:
    image:
      registry: registry.internal
      repository: prom/prometheus
      tag: "latest"

  grafana:
    image:
      registry: registry.internal
      repository: grafana/grafana
      tag: "latest"

vault:
  image:
    registry: registry.internal
    repository: hashicorp/vault
    tag: "latest"

Network Isolation in Air-Gap

# NetworkPolicy for air-gapped environment
# Deny all egress by default, allow only internal communication
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: airgap-default-deny-egress
  namespace: placino-backend
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  # Allow DNS within cluster
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
    ports:
    - protocol: UDP
      port: 53
  # Allow internal service discovery
  - to:
    - podSelector: {}
    ports:
    - protocol: TCP
      port: 8000
    - protocol: TCP
      port: 8010
    - protocol: TCP
      port: 8020
    - protocol: TCP
      port: 8030
    - protocol: TCP
      port: 8040
    - protocol: TCP
      port: 8050
    - protocol: TCP
      port: 8070
    - protocol: TCP
      port: 8090
    - protocol: TCP
      port: 8089
    - protocol: TCP
      port: 8088
  # Allow to data layer
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: placino-data
    ports:
    - protocol: TCP
      port: 5432
    - protocol: TCP
      port: 8123
    - protocol: TCP
      port: 9092
    - protocol: TCP
      port: 6379
    - protocol: TCP
      port: 9000

Helm Chart Dependencies in Air-Gap

# Chart.yaml
apiVersion: v2
name: placino
description: Self-hosted data clean room platform
type: application
version: 1.4.0
appVersion: "1.4.0"

# If using subcharts, download them offline
dependencies:
  - name: postgresql
    version: "12.x.x"
    repository: "file://../charts/postgresql"
    alias: postgres-internal
  - name: prometheus
    version: "15.x.x"
    repository: "file://../charts/prometheus"
  - name: grafana
    version: "6.x.x"
    repository: "file://../charts/grafana"

# Deployment without internet:
# 1. On a connected machine, fetch the chart dependencies
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
helm dependency update ./placino-helm

# 2. Package with dependencies included
helm package ./placino-helm --destination ./releases

# 3. Transfer to air-gapped environment
# 4. Install from local tarball
helm install placino ./releases/placino-1.4.0.tgz \
  --namespace placino \
  --values values-airgap.yaml \
  --set image.registry=registry.internal

Conclusion

Deploying Placino on Kubernetes scales from development clusters (single node, permissive policies) to production installations (multi-node, air-gapped networks, strict isolation). The patterns outlined in this guide—namespace segmentation, OPA policy enforcement, Vault secret management, and comprehensive observability—form the foundation for secure, reliable data clean room operations.

The key principles apply across all deployment scales: enforce network policies at the CNI level, manage secrets dynamically, distribute workloads across availability zones, monitor every component, and test disaster recovery scenarios regularly. Start with the checklist, deploy incrementally, and harden progressively as production loads increase.

For air-gapped environments, prepare image registries and Helm charts in advance, test the full deployment offline, and maintain documentation for network segmentation and secret rotation procedures.