Deploying Decentralized Clean Rooms on Kubernetes
Deploying a self-hosted data clean room platform means orchestrating dozens of containers, stateful components, and network isolation policies. This guide covers the production deployment of Placino on Kubernetes, from initial architecture decisions through hardening for air-gapped environments.
Prerequisites and Architecture Overview
Placino's architecture consists of 35 Docker containers distributed across 10 Go microservices, 8 infrastructure components, 8 observability services, a React frontend, and an API gateway. Three isolated Docker networks enforce data separation at the container level: the frontend network hosts the React application and Kong gateway, the backend network runs core microservices, and the data network isolates PostgreSQL, ClickHouse, Kafka, Valkey, and MinIO.
Core Services and Ports
The microservices operate on dedicated internal ports:
// Authentication and Gateway
auth-service:8060 # Token generation, SAML/OIDC validation
kong-api-gateway:8000 # Public API endpoint
kong-admin:8001 # Kong administration API
// Core Data Operations
ingestion-service:8010 # Data ingestion, deduplication, normalization
matching-service:8020 # Entity resolution, fuzzy matching
query-service:8030 # Federated SQL execution
governance-service:8040 # Consent, retention, lineage management
audit-service:8050 # Event logging, compliance audit trail
catalog-service:8070 # Data asset discovery, metadata
// AI and ML Pipeline
ai-proxy-service:8090 # LLM integration, model serving
audience-service:8089 # Behavioral segmentation, scoring
ml-service:8088 # Model training, inference
// Infrastructure
opa-sidecar:8181 # Open Policy Agent enforcement
prometheus:9090 # Metrics collection
grafana:3000 # Metrics visualization
Kubernetes Prerequisites
- Kubernetes 1.24+ with RBAC enabled
- Helm 3.12+ for chart management
- Container runtime: containerd or Docker with overlay2 storage driver
- Minimum cluster resources: 8 CPU cores, 32GB memory (development); 24 cores, 128GB (production)
- Storage class provisioner (EBS, GCE PD, local-path, or NFS)
- Network policies supported at CNI layer (Calico, Cilium, or native)
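A quick preflight on the target cluster catches most gaps before the first install. A minimal sketch (the CNI label selectors vary by plugin, and kubectl top requires metrics-server):
# Versions
kubectl version
helm version
# Storage classes must exist before any PVC below can bind
kubectl get storageclass
# Confirm a NetworkPolicy-capable CNI is running (label names vary by CNI)
kubectl get pods -n kube-system -l k8s-app=calico-node
kubectl get pods -n kube-system -l k8s-app=cilium
# Rough capacity check against the minimums above (requires metrics-server)
kubectl top nodes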
Namespace and Network Policy Design
Kubernetes namespaces provide administrative boundaries, but true isolation requires network policies. Placino's three-network architecture translates into separate namespaces with strict ingress/egress rules.
Namespace Segmentation
apiVersion: v1
kind: Namespace
metadata:
name: placino-frontend
labels:
app.kubernetes.io/name: placino
app.kubernetes.io/component: frontend
---
apiVersion: v1
kind: Namespace
metadata:
name: placino-backend
labels:
app.kubernetes.io/name: placino
app.kubernetes.io/component: backend
---
apiVersion: v1
kind: Namespace
metadata:
name: placino-data
labels:
app.kubernetes.io/name: placino
app.kubernetes.io/component: data
---
apiVersion: v1
kind: Namespace
metadata:
name: placino-observability
labels:
app.kubernetes.io/name: placino
app.kubernetes.io/component: observability
Network Policies
Network policies define the allowed traffic flows. The frontend namespace accepts external ingress, and its pods (the React app and Kong) may only open connections to the backend service ports. Backend services can reach the data-layer components, while pods in the data namespace may only talk to each other on the infrastructure ports and to cluster DNS.
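After applying the policies below, isolation can be spot-checked with short-lived pods. A sketch (the busybox image and timeout values are arbitrary choices):
# Should succeed: frontend pods may reach backend service ports
kubectl -n placino-frontend run np-test --rm -it --restart=Never --image=busybox -- \
nc -zv -w 3 query-service.placino-backend.svc.cluster.local 8030
# Should time out: the data namespace has no egress to the backend
kubectl -n placino-data run np-test --rm -it --restart=Never --image=busybox -- \
nc -zv -w 3 auth-service.placino-backend.svc.cluster.local 8060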
# Frontend namespace: permit ingress, egress to backend only
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: frontend-egress-backend
namespace: placino-frontend
spec:
podSelector: {}
policyTypes:
- Egress
egress:
- to:
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: placino-backend # automatic namespace label, K8s 1.22+
ports:
- protocol: TCP
port: 8060 # auth-service
- protocol: TCP
port: 8010 # ingestion-service
- protocol: TCP
port: 8020 # matching-service
- protocol: TCP
port: 8030 # query-service
- protocol: TCP
port: 8040 # governance-service
- protocol: TCP
port: 8050 # audit-service
- protocol: TCP
port: 8070 # catalog-service
- protocol: TCP
port: 8090 # ai-proxy-service
- protocol: TCP
port: 8089 # audience-service
- protocol: TCP
port: 8088 # ml-service
- to:
- podSelector:
matchLabels:
k8s-app: kube-dns
ports:
- protocol: UDP
port: 53
---
# Backend namespace: permit ingress from frontend, egress to data only
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: backend-data-isolation
namespace: placino-backend
spec:
podSelector: {}
policyTypes:
- Ingress
- Egress
ingress:
- from:
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: placino-frontend
ports:
- protocol: TCP
port: 8060
- protocol: TCP
port: 8010
- protocol: TCP
port: 8020
- protocol: TCP
port: 8030
- protocol: TCP
port: 8040
- protocol: TCP
port: 8050
- protocol: TCP
port: 8070
- protocol: TCP
port: 8090
- protocol: TCP
port: 8089
- protocol: TCP
port: 8088
egress:
- to:
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: placino-data
ports:
- protocol: TCP
port: 5432 # PostgreSQL
- protocol: TCP
port: 8123 # ClickHouse
- protocol: TCP
port: 9092 # Kafka
- protocol: TCP
port: 6379 # Valkey
- protocol: TCP
port: 9000 # MinIO
- to:
- podSelector:
matchLabels:
k8s-app: kube-dns
ports:
- protocol: UDP
port: 53
---
# Data namespace: deny all egress except DNS and inter-data services
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: data-network-isolation
namespace: placino-data
spec:
podSelector: {}
policyTypes:
- Egress
egress:
- to:
- podSelector: {}
ports:
- protocol: TCP
port: 5432
- protocol: TCP
port: 8123
- protocol: TCP
port: 9092
- protocol: TCP
port: 6379
- protocol: TCP
port: 9000
- to:
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: kube-system
podSelector:
matchLabels:
k8s-app: kube-dns
ports:
- protocol: UDP
port: 53
Helm Chart Structure
A monolithic Helm chart simplifies upgrades and cross-namespace dependencies. The chart structure organizes resources by concern: one subtemplate per namespace, shared secrets, and configurable values for each microservice.
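Install and upgrade then become single commands against the chart root shown in the tree below. A sketch (release name and target namespace are assumptions):
# Environment-specific values files layer on top of the defaults
helm upgrade --install placino ./placino-helm \
--namespace placino --create-namespace \
-f ./placino-helm/values.yaml \
-f ./placino-helm/values-prod.yaml
# Preview a pending upgrade (requires the helm-diff plugin)
helm diff upgrade placino ./placino-helm -f ./placino-helm/values-prod.yaml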
placino-helm/
├── Chart.yaml
├── values.yaml
├── values-dev.yaml
├── values-prod.yaml
├── templates/
│ ├── _helpers.tpl
│ ├── namespace.yaml
│ ├── network-policies.yaml
│ ├── secrets/
│ │ ├── vault-auth.yaml
│ │ ├── tls-certs.yaml
│ │ └── database-credentials.yaml
│ ├── frontend/
│ │ ├── react-deployment.yaml
│ │ ├── react-service.yaml
│ │ └── react-configmap.yaml
│ ├── kong/
│ │ ├── kong-deployment.yaml
│ │ ├── kong-service.yaml
│ │ ├── kong-configmap.yaml
│ │ └── kong-ingress.yaml
│ ├── backend/
│ │ ├── auth-deployment.yaml
│ │ ├── ingestion-deployment.yaml
│ │ ├── matching-deployment.yaml
│ │ ├── query-deployment.yaml
│ │ ├── governance-deployment.yaml
│ │ ├── audit-deployment.yaml
│ │ ├── catalog-deployment.yaml
│ │ ├── ai-proxy-deployment.yaml
│ │ ├── audience-deployment.yaml
│ │ ├── ml-deployment.yaml
│ │ ├── opa-sidecar-configmap.yaml
│ │ └── backend-services.yaml
│ ├── data/
│ │ ├── postgres-deployment.yaml
│ │ ├── postgres-pvc.yaml
│ │ ├── clickhouse-deployment.yaml
│ │ ├── clickhouse-pvc.yaml
│ │ ├── kafka-statefulset.yaml
│ │ ├── valkey-deployment.yaml
│ │ ├── minio-statefulset.yaml
│ │ └── data-services.yaml
│ ├── observability/
│ │ ├── prometheus-deployment.yaml
│ │ ├── prometheus-configmap.yaml
│ │ ├── grafana-deployment.yaml
│ │ ├── grafana-configmap.yaml
│ │ └── observability-services.yaml
│ ├── vault/
│ │ ├── vault-deployment.yaml
│ │ ├── vault-configmap.yaml
│ │ └── vault-service.yaml
│ ├── rbac/
│ │ ├── service-accounts.yaml
│ │ ├── role-bindings.yaml
│ │ └── pod-security-policies.yaml
│ └── ingress.yaml
└── charts/
└── (external chart dependencies if using subchart pattern)
values.yaml Structure
replicaCount: 3
environment: production
image:
registry: gcr.io
repository: placino-images
tag: "1.4.0"
pullPolicy: IfNotPresent
frontend:
react:
replicas: 3
resources:
requests:
memory: "256Mi"
cpu: "100m"
limits:
memory: "512Mi"
cpu: "500m"
env:
API_ENDPOINT: "https://api.placino.local"
LOG_LEVEL: "info"
kong:
replicas: 3
resources:
requests:
memory: "512Mi"
cpu: "250m"
limits:
memory: "1Gi"
cpu: "1000m"
config:
database: postgresql
dns_resolver: kube-dns
upstream_keepalive: 60
backend:
auth:
replicas: 2
resources:
requests:
memory: "512Mi"
cpu: "250m"
limits:
memory: "1Gi"
cpu: "1000m"
env:
AUTH_TIMEOUT: "30s"
TOKEN_TTL: "3600"
SAML_ENABLED: "true"
ingestion:
replicas: 3
resources:
requests:
memory: "1Gi"
cpu: "500m"
limits:
memory: "2Gi"
cpu: "2000m"
batchSize: 10000
deduplicationWindow: "24h"
matching:
replicas: 2
resources:
requests:
memory: "2Gi"
cpu: "1000m"
limits:
memory: "4Gi"
cpu: "2000m"
algorithm: "fuzzy-leven"
confidenceThreshold: 0.85
query:
replicas: 3
resources:
requests:
memory: "1Gi"
cpu: "500m"
limits:
memory: "3Gi"
cpu: "2000m"
queryTimeout: "300s"
maxConcurrency: 50
governance:
replicas: 2
resources:
requests:
memory: "512Mi"
cpu: "250m"
limits:
memory: "1Gi"
cpu: "1000m"
audit:
replicas: 2
resources:
requests:
memory: "512Mi"
cpu: "250m"
limits:
memory: "1Gi"
cpu: "1000m"
retention: "90d"
catalog:
replicas: 2
resources:
requests:
memory: "512Mi"
cpu: "250m"
limits:
memory: "1Gi"
cpu: "1000m"
aiProxy:
replicas: 2
resources:
requests:
memory: "1Gi"
cpu: "500m"
limits:
memory: "2Gi"
cpu: "2000m"
modelProvider: "openai"
modelName: "gpt-4"
audience:
replicas: 2
resources:
requests:
memory: "2Gi"
cpu: "1000m"
limits:
memory: "4Gi"
cpu: "2000m"
ml:
replicas: 2
resources:
requests:
memory: "2Gi"
cpu: "1000m"
limits:
memory: "4Gi"
cpu: "2000m"
data:
postgres:
replicas: 1
resources:
requests:
memory: "4Gi"
cpu: "2000m"
limits:
memory: "8Gi"
cpu: "4000m"
storage: 100Gi
storageClass: "fast-ssd"
maxConnections: 200
sharedBuffers: "2GB"
effectiveCacheSize: "6GB"
clickhouse:
replicas: 1
resources:
requests:
memory: "8Gi"
cpu: "4000m"
limits:
memory: "16Gi"
cpu: "8000m"
storage: 500Gi
storageClass: "fast-ssd"
maxMemoryUsage: "10Gi"
kafka:
replicas: 3
resources:
requests:
memory: "2Gi"
cpu: "1000m"
limits:
memory: "4Gi"
cpu: "2000m"
storage: 200Gi
storageClass: "standard"
retentionMs: 604800000
valkey:
replicas: 1
resources:
requests:
memory: "2Gi"
cpu: "500m"
limits:
memory: "4Gi"
cpu: "1000m"
maxmemory: "3gb"
maxmemoryPolicy: "allkeys-lru"
minio:
replicas: 4
resources:
requests:
memory: "1Gi"
cpu: "500m"
limits:
memory: "2Gi"
cpu: "1000m"
storage: 1Ti
storageClass: "fast-ssd"
observability:
prometheus:
replicas: 2
storage: 100Gi
storageClass: "fast-ssd"
retention: "15d"
resources:
requests:
memory: "2Gi"
cpu: "500m"
limits:
memory: "4Gi"
cpu: "2000m"
grafana:
replicas: 2
adminPassword: "" # Set via secrets
storage: 10Gi
storageClass: "standard"
resources:
requests:
memory: "256Mi"
cpu: "100m"
limits:
memory: "512Mi"
cpu: "500m"
vault:
enabled: true
replicas: 3
storage: 10Gi
storageClass: "fast-ssd"
resources:
requests:
memory: "512Mi"
cpu: "250m"
limits:
memory: "1Gi"
cpu: "1000m"
ingress:
enabled: true
className: "nginx"
tls:
enabled: true
issuer: "letsencrypt-prod"
hosts:
- host: "placino.local"
paths:
- path: /
pathType: Prefix
affinity:
podAntiAffinity: preferred
tolerations: []
nodeSelector: {}
Configuring Persistent Storage
Stateful components require careful storage configuration. PostgreSQL, ClickHouse, and MinIO use different I/O patterns and should be provisioned accordingly.
Storage Classes
# Fast SSD for databases
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: fast-ssd
provisioner: ebs.csi.aws.com
parameters:
type: io1
iops: "1000"
fstype: ext4
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
---
# Standard storage for Kafka, observability
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: standard
provisioner: ebs.csi.aws.com
parameters:
type: gp3
fstype: ext4
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
PostgreSQL Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
name: postgres
namespace: placino-data
spec:
replicas: 1
selector:
matchLabels:
app: postgres
template:
metadata:
labels:
app: postgres
spec:
containers:
- name: postgres
image: postgres:15-alpine
ports:
- containerPort: 5432
env:
- name: POSTGRES_DB
value: placino
- name: POSTGRES_USER
valueFrom:
secretKeyRef:
name: postgres-credentials
key: username
- name: POSTGRES_PASSWORD
valueFrom:
secretKeyRef:
name: postgres-credentials
key: password
# -c settings are server flags, not initdb options; pass them as container args
# so the postgres image entrypoint forwards them to the server
args: ["-c", "max_connections=200", "-c", "shared_buffers=2GB", "-c", "effective_cache_size=6GB", "-c", "maintenance_work_mem=512MB", "-c", "checkpoint_completion_target=0.9", "-c", "wal_buffers=16MB", "-c", "default_statistics_target=100", "-c", "random_page_cost=1.1", "-c", "effective_io_concurrency=200", "-c", "work_mem=10485kB", "-c", "min_wal_size=4GB", "-c", "max_wal_size=16GB"]
resources:
requests:
memory: "4Gi"
cpu: "2000m"
limits:
memory: "8Gi"
cpu: "4000m"
volumeMounts:
- name: postgres-storage
mountPath: /var/lib/postgresql/data
livenessProbe:
exec:
command:
- /bin/sh
- -c
- pg_isready -U $POSTGRES_USER
initialDelaySeconds: 30
periodSeconds: 10
readinessProbe:
exec:
command:
- /bin/sh
- -c
- pg_isready -U $POSTGRES_USER
initialDelaySeconds: 5
periodSeconds: 10
volumes:
- name: postgres-storage
persistentVolumeClaim:
claimName: postgres-pvc
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: postgres-pvc
namespace: placino-data
spec:
accessModes:
- ReadWriteOnce
storageClassName: fast-ssd
resources:
requests:
storage: 100Gi
---
apiVersion: v1
kind: Service
metadata:
name: postgres
namespace: placino-data
spec:
type: ClusterIP
ports:
- port: 5432
targetPort: 5432
selector:
app: postgres
ClickHouse Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
name: clickhouse
namespace: placino-data
spec:
replicas: 1
selector:
matchLabels:
app: clickhouse
template:
metadata:
labels:
app: clickhouse
spec:
containers:
- name: clickhouse
image: clickhouse/clickhouse-server:latest
ports:
- name: http
containerPort: 8123
- name: tcp
containerPort: 9000
env:
- name: CLICKHOUSE_DB
value: placino
resources:
requests:
memory: "8Gi"
cpu: "4000m"
limits:
memory: "16Gi"
cpu: "8000m"
volumeMounts:
- name: clickhouse-storage
mountPath: /var/lib/clickhouse
- name: clickhouse-config
# mount into config.d/users.d; mounting over /etc/clickhouse-server would hide the image defaults
mountPath: /etc/clickhouse-server/config.d/placino.xml
subPath: config.xml
- name: clickhouse-config
mountPath: /etc/clickhouse-server/users.d/placino-memory.xml
subPath: users.xml
livenessProbe:
httpGet:
path: /ping
port: 8123
initialDelaySeconds: 30
periodSeconds: 10
readinessProbe:
httpGet:
path: /ping
port: 8123
initialDelaySeconds: 5
periodSeconds: 10
volumes:
- name: clickhouse-storage
persistentVolumeClaim:
claimName: clickhouse-pvc
- name: clickhouse-config
configMap:
name: clickhouse-config
---
apiVersion: v1
kind: ConfigMap
metadata:
name: clickhouse-config
namespace: placino-data
data:
config.xml: |
<clickhouse>
<logger>
<level>information</level>
</logger>
<http_port>8123</http_port>
<tcp_port>9000</tcp_port>
</clickhouse>
users.xml: |
<!-- per-query memory caps are user-profile settings, hence users.d -->
<clickhouse>
<profiles>
<default>
<max_memory_usage>10737418240</max_memory_usage>
<max_memory_usage_for_user>10737418240</max_memory_usage_for_user>
</default>
</profiles>
</clickhouse>
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: clickhouse-pvc
namespace: placino-data
spec:
accessModes:
- ReadWriteOnce
storageClassName: fast-ssd
resources:
requests:
storage: 500Gi
---
apiVersion: v1
kind: Service
metadata:
name: clickhouse
namespace: placino-data
spec:
type: ClusterIP
ports:
- name: http
port: 8123
targetPort: 8123
- name: tcp
port: 9000
targetPort: 9000
selector:
app: clickhouse
MinIO Multi-Node Deployment
apiVersion: apps/v1
kind: StatefulSet # a Deployment cannot provide the stable minio-0..minio-3 DNS names used below
metadata:
name: minio
namespace: placino-data
spec:
serviceName: minio # the headless Service defined below
replicas: 4
selector:
matchLabels:
app: minio
template:
metadata:
labels:
app: minio
spec:
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: app
operator: In
values:
- minio
topologyKey: kubernetes.io/hostname
containers:
- name: minio
image: minio/minio:latest
command:
- minio
- server
- http://minio-0.minio.placino-data.svc.cluster.local/data
- http://minio-1.minio.placino-data.svc.cluster.local/data
- http://minio-2.minio.placino-data.svc.cluster.local/data
- http://minio-3.minio.placino-data.svc.cluster.local/data
ports:
- containerPort: 9000
name: minio
- containerPort: 9001
name: console
env:
- name: MINIO_ROOT_USER
valueFrom:
secretKeyRef:
name: minio-credentials
key: username
- name: MINIO_ROOT_PASSWORD
valueFrom:
secretKeyRef:
name: minio-credentials
key: password
resources:
requests:
memory: "1Gi"
cpu: "500m"
limits:
memory: "2Gi"
cpu: "1000m"
volumeMounts:
- name: minio-storage
mountPath: /data
livenessProbe:
httpGet:
path: /minio/health/live
port: 9000
initialDelaySeconds: 30
periodSeconds: 20
readinessProbe:
httpGet:
path: /minio/health/ready
port: 9000
initialDelaySeconds: 5
periodSeconds: 10
volumeClaimTemplates: # per-pod volumes; one shared ReadWriteOnce PVC cannot back four replicas
- metadata:
name: minio-storage
spec:
accessModes:
- ReadWriteOnce
storageClassName: fast-ssd
resources:
requests:
storage: 1Ti
---
apiVersion: v1
kind: Service
metadata:
name: minio
namespace: placino-data
spec:
clusterIP: None
ports:
- port: 9000
targetPort: 9000
- port: 9001
targetPort: 9001
selector:
app: minio
Secrets Management with Vault
Kubernetes Secrets are base64-encoded, not encrypted at rest by default. HashiCorp Vault provides dynamic secret generation, rotation, and audit logging. Placino integrates Vault for database credentials, API keys, TLS certificates, and encryption keys.
Vault Configuration
# Vault Auth Method: Kubernetes
vault auth enable kubernetes
vault write auth/kubernetes/config \
token_reviewer_jwt=@/var/run/secrets/kubernetes.io/serviceaccount/token \
kubernetes_host="https://$KUBERNETES_SERVICE_HOST:$KUBERNETES_SERVICE_PORT" \
kubernetes_ca_cert=@/var/run/secrets/kubernetes.io/serviceaccount/ca.crt
# Database Secret Engine
vault secrets enable database
vault write database/config/postgres \
plugin_name=postgresql-database-plugin \
allowed_roles="placino-role" \
connection_url="postgresql://{{username}}:{{password}}@postgres.placino-data:5432/placino" \
username="vault_admin" \
password="$(tr -dc 'a-zA-Z0-9' < /dev/urandom | fold -w 32 | head -n 1)"
vault write database/roles/placino-role \
db_name=postgres \
creation_statements="CREATE USER \"{{name}}\" WITH PASSWORD '{{password}}' VALID UNTIL '{{expiration}}'; GRANT CONNECT ON DATABASE placino TO \"{{name}}\"; GRANT USAGE ON SCHEMA public TO \"{{name}}\"; GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO \"{{name}}\";" \
default_ttl="1h" \
max_ttl="24h"
# KV Secret Engine
vault secrets enable -version=2 kv
vault kv put kv/placino/minio access_key="$(openssl rand -base64 32)" secret_key="$(openssl rand -base64 32)"
vault kv put kv/placino/tls cert=@/path/to/cert.pem key=@/path/to/key.pem
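# Transit engine for application-layer encryption (the policy below grants access to it)
vault secrets enable transit
vault write -f transit/keys/placino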
# Policies
vault policy write placino-policy - <<EOF
path "database/creds/placino-role" {
capabilities = ["read"]
}
path "kv/data/placino/*" {
capabilities = ["read", "list"]
}
path "transit/encrypt/placino" {
capabilities = ["update"]
}
path "transit/compute/placino" {
capabilities = ["update"]
}
EOF
# Kubernetes Auth Role
vault write auth/kubernetes/role/placino \
bound_service_account_names=placino-backend,placino-data \
bound_service_account_namespaces=placino-backend,placino-data \
policies=placino-policy \
ttl=24h
Vault Agent Injector
apiVersion: apps/v1
kind: Deployment
metadata:
name: auth-service
namespace: placino-backend
spec:
replicas: 2
selector:
matchLabels:
app: auth
template:
metadata:
labels:
app: auth
annotations:
vault.hashicorp.com/agent-inject: "true"
vault.hashicorp.com/role: "placino"
vault.hashicorp.com/agent-inject-secret-database: "database/creds/placino-role"
vault.hashicorp.com/agent-inject-template-database: |
{{- with secret "database/creds/placino-role" -}}
export DB_USER="{{ .Data.username }}"
export DB_PASSWORD="{{ .Data.password }}"
export DATABASE_URL="postgresql://{{ .Data.username }}:{{ .Data.password }}@postgres.placino-data:5432/placino"
{{- end }}
vault.hashicorp.com/agent-inject-secret-minio: "kv/data/placino/minio"
vault.hashicorp.com/agent-inject-template-minio: |
{{- with secret "kv/data/placino/minio" -}}
export MINIO_ACCESS_KEY="{{ .Data.data.access_key }}"
export MINIO_SECRET_KEY="{{ .Data.data.secret_key }}"
{{- end }}
spec:
serviceAccountName: placino-backend
containers:
- name: auth
image: placino/auth:1.4.0
ports:
- containerPort: 8060
env:
# DB credentials and DATABASE_URL come from sourcing the Vault-rendered file
# at /vault/secrets/database in the container entrypoint; note that the database
# secrets engine exposes .Data.username directly (only KV v2 uses .Data.data.*)
- name: VAULT_ADDR
value: "http://vault.placino-data:8200"
resources:
requests:
memory: "512Mi"
cpu: "250m"
limits:
memory: "1Gi"
cpu: "1000m"Kong API Gateway Configuration
Kong acts as the single entry point for all external API traffic. It handles authentication (OAuth2, API keys), rate limiting, request/response transformation, and routing to backend services.
apiVersion: apps/v1
kind: Deployment
metadata:
name: kong
namespace: placino-frontend
spec:
replicas: 3
selector:
matchLabels:
app: kong
template:
metadata:
labels:
app: kong
spec:
containers:
- name: kong
image: kong:3.4-alpine
ports:
- name: proxy
containerPort: 8000
- name: admin
containerPort: 8001
env:
- name: KONG_DATABASE
value: "postgres"
- name: KONG_PG_HOST
value: "postgres.placino-data"
- name: KONG_PG_PORT
value: "5432"
- name: KONG_PG_USER
valueFrom:
secretKeyRef:
name: kong-db-creds
key: username
- name: KONG_PG_PASSWORD
valueFrom:
secretKeyRef:
name: kong-db-creds
key: password
- name: KONG_PG_DATABASE
value: "kong"
- name: KONG_PROXY_ACCESS_LOG
value: "/dev/stdout"
- name: KONG_PROXY_ERROR_LOG
value: "/dev/stderr"
- name: KONG_ADMIN_ACCESS_LOG
value: "/dev/stdout"
- name: KONG_ADMIN_ERROR_LOG
value: "/dev/stderr"
- name: KONG_ADMIN_LISTEN
value: "0.0.0.0:8001"
- name: KONG_DNS_RESOLVER
value: "10.96.0.10" # Kong expects an IP; substitute your cluster DNS ClusterIP
resources:
requests:
memory: "512Mi"
cpu: "250m"
limits:
memory: "1Gi"
cpu: "1000m"
livenessProbe:
httpGet:
path: /status
port: 8001
initialDelaySeconds: 30
periodSeconds: 10
readinessProbe:
httpGet:
path: /status
port: 8001
initialDelaySeconds: 5
periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
name: kong-proxy
namespace: placino-frontend
spec:
type: LoadBalancer
ports:
- name: proxy
port: 80
targetPort: 8000
protocol: TCP
- name: proxy-tls
port: 443
targetPort: 8443
protocol: TCP
selector:
app: kong
---
apiVersion: v1
kind: Service
metadata:
name: kong-admin
namespace: placino-frontend
spec:
type: ClusterIP
ports:
- port: 8001
targetPort: 8001
selector:
app: kong
---
apiVersion: v1
kind: ConfigMap
metadata:
name: kong-routes
namespace: placino-frontend
data:
routes.sh: |
#!/bin/bash
KONG_ADMIN_URL="http://kong-admin:8001"
# Upstream services
curl -X POST "$KONG_ADMIN_URL/upstreams" --data name=auth-upstream
curl -X POST "$KONG_ADMIN_URL/upstreams/auth-upstream/targets" --data "target=auth-service.placino-backend:8060"
curl -X POST "$KONG_ADMIN_URL/upstreams" --data name=ingestion-upstream
curl -X POST "$KONG_ADMIN_URL/upstreams/ingestion-upstream/targets" --data "target=ingestion-service.placino-backend:8010"
curl -X POST "$KONG_ADMIN_URL/upstreams" --data name=matching-upstream
curl -X POST "$KONG_ADMIN_URL/upstreams/matching-upstream/targets" --data "target=matching-service.placino-backend:8020"
curl -X POST "$KONG_ADMIN_URL/upstreams" --data name=query-upstream
curl -X POST "$KONG_ADMIN_URL/upstreams/query-upstream/targets" --data "target=query-service.placino-backend:8030"
curl -X POST "$KONG_ADMIN_URL/upstreams" --data name=governance-upstream
curl -X POST "$KONG_ADMIN_URL/upstreams/governance-upstream/targets" --data "target=governance-service.placino-backend:8040"
curl -X POST "$KONG_ADMIN_URL/upstreams" --data name=audit-upstream
curl -X POST "$KONG_ADMIN_URL/upstreams/audit-upstream/targets" --data "target=audit-service.placino-backend:8050"
curl -X POST "$KONG_ADMIN_URL/upstreams" --data name=catalog-upstream
curl -X POST "$KONG_ADMIN_URL/upstreams/catalog-upstream/targets" --data "target=catalog-service.placino-backend:8070"
# Services
curl -X POST "$KONG_ADMIN_URL/services" --data "name=auth" --data "host=auth-upstream" --data "port=8060" --data "protocol=http"
curl -X POST "$KONG_ADMIN_URL/services" --data "name=ingestion" --data "host=ingestion-upstream" --data "port=8010" --data "protocol=http"
# Routes
curl -X POST "$KONG_ADMIN_URL/services/auth/routes" --data "paths[]=/auth" --data "methods[]=POST" --data "methods[]=GET"
curl -X POST "$KONG_ADMIN_URL/services/ingestion/routes" --data "paths[]=/ingest" --data "methods[]=POST"
# Plugins
curl -X POST "$KONG_ADMIN_URL/plugins" --data "name=oauth2" --data "config.enable_authorization_code=true" --data "config.provision_key=$(uuidgen)"
curl -X POST "$KONG_ADMIN_URL/plugins" --data "name=rate-limiting" --data "config.minute=1000" --data "config.hour=10000"
curl -X POST "$KONG_ADMIN_URL/plugins" --data "name=request-transformer" --data "config.add.headers[]=X-Request-ID: $(uuidgen)"
curl -X POST "$KONG_ADMIN_URL/plugins" --data "name=cors" --data "config.origins[]=*" --data "config.methods[]=GET,POST,PUT,DELETE"OPA Sidecar Deployment
Open Policy Agent (OPA) enforces fine-grained access control policies at the application layer. Each backend service runs an OPA sidecar that evaluates incoming requests against policy rules before processing.
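Services consult the sidecar over localhost through OPA's data API. A minimal sketch of a decision query, assuming the policies defined below (the token and role values are illustrative):
# POST input to /v1/data/<package path>/<rule>; the sidecar returns the decision
curl -s -X POST http://localhost:8181/v1/data/placino/data/allow \
-H 'Content-Type: application/json' \
-d '{"input": {"method": "GET", "token": "abc123", "user_role": "viewer"}}'
# expected: {"result": true}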
apiVersion: v1
kind: ConfigMap
metadata:
name: opa-policies
namespace: placino-backend
data:
data.rego: |
package placino.data
import data.placino.tokens
import data.placino.roles
# Decision function
default allow = false
allow {
input.method == "GET"
tokens.valid[input.token]
roles.can_read[input.user_role]
}
allow {
input.method == "POST"
tokens.valid[input.token]
roles.can_write[input.user_role]
}
allow {
input.method == "DELETE"
tokens.valid[input.token]
roles.is_admin[input.user_role]
}
tokens.rego: |
package placino.tokens
valid[token] {
token := input.token
token != ""
not token_revoked[token]
}
token_revoked[revoked_token] {
revoked_tokens := ["token123", "token456"]
revoked_token := revoked_tokens[_]
}
roles.rego: |
package placino.roles
can_read[role] {
role := "viewer"
}
can_read[role] {
role := "editor"
}
can_read[role] {
role := "admin"
}
can_write[role] {
role := "editor"
}
can_write[role] {
role := "admin"
}
is_admin[role] {
role := "admin"
}
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: query-service
namespace: placino-backend
spec:
replicas: 3
selector:
matchLabels:
app: query
template:
metadata:
labels:
app: query
spec:
containers:
- name: query
image: placino/query:1.4.0
ports:
- containerPort: 8030
env:
- name: OPA_ADDR
value: "http://localhost:8181"
- name: LOG_LEVEL
value: "info"
resources:
requests:
memory: "1Gi"
cpu: "500m"
limits:
memory: "3Gi"
cpu: "2000m"
- name: opa
image: openpolicyagent/opa:latest
args:
- "run"
- "--server"
- "--addr=localhost:8181"
- "--log-level=info"
- "/policies"
ports:
- containerPort: 8181
resources:
requests:
memory: "256Mi"
cpu: "100m"
limits:
memory: "512Mi"
cpu: "500m"
volumeMounts:
- name: opa-policies
mountPath: /policies
livenessProbe:
httpGet:
path: /health
port: 8181
initialDelaySeconds: 5
periodSeconds: 5
volumes:
- name: opa-policies
configMap:
name: opa-policies
Horizontal Scaling Patterns
Stateless services (auth, ingestion, matching, query) scale horizontally via Kubernetes Deployments and Horizontal Pod Autoscalers, with Kafka buffering ingestion work so producers and consumers can scale independently. Stateful services (PostgreSQL, ClickHouse) need different patterns: replication and vertical sizing rather than autoscaling.
Horizontal Pod Autoscaling
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: query-hpa
namespace: placino-backend
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: query-service
minReplicas: 3
maxReplicas: 10
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80
behavior:
scaleDown:
stabilizationWindowSeconds: 300
policies:
- type: Percent
value: 50
periodSeconds: 60
scaleUp:
stabilizationWindowSeconds: 0
policies:
- type: Percent
value: 100
periodSeconds: 30
- type: Pods
value: 2
periodSeconds: 30
selectPolicy: Max
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: ingestion-hpa
namespace: placino-backend
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: ingestion-service
minReplicas: 3
maxReplicas: 15
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 75
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 85
# custom metric; requires a custom-metrics adapter such as prometheus-adapter
- type: Pods
pods:
metric:
name: kafka_consumer_lag_seconds
target:
type: AverageValue
averageValue: "60"
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: matching-hpa
namespace: placino-backend
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: matching-service
minReplicas: 2
maxReplicas: 10
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 80
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 85
Pod Disruption Budgets
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
name: query-pdb
namespace: placino-backend
spec:
minAvailable: 2
selector:
matchLabels:
app: query
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
name: ingestion-pdb
namespace: placino-backend
spec:
minAvailable: 2
selector:
matchLabels:
app: ingestion
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
name: postgres-pdb
namespace: placino-data
spec:
minAvailable: 1
selector:
matchLabels:
app: postgres
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
name: kafka-pdb
namespace: placino-data
spec:
minAvailable: 2
selector:
matchLabels:
app: kafka
Observability Stack (Prometheus, Grafana)
Production Kubernetes clusters require metric collection, visualization, and alerting. Prometheus scrapes metrics from Kubernetes API servers, kubelet, and application endpoints. Grafana provides dashboards. AlertManager routes alerts based on severity.
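The placino-services scrape job below discovers pods by annotation, so each service's pod template must carry them. A sketch for the query service (the port is assumed from its containerPort):
# Added to the Deployment's pod template metadata
annotations:
prometheus.io/scrape: "true"
prometheus.io/port: "8030"
prometheus.io/path: "/metrics"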
apiVersion: apps/v1
kind: Deployment
metadata:
name: prometheus
namespace: placino-observability
spec:
replicas: 2 # both replicas share one ReadWriteOnce PVC; for real HA run Prometheus as a StatefulSet with per-replica storage
selector:
matchLabels:
app: prometheus
template:
metadata:
labels:
app: prometheus
spec:
serviceAccountName: prometheus
containers:
- name: prometheus
image: prom/prometheus:latest
args:
- "--config.file=/etc/prometheus/prometheus.yml"
- "--storage.tsdb.path=/prometheus"
- "--storage.tsdb.retention.time=15d"
- "--web.console.libraries=/usr/share/prometheus/console_libraries"
- "--web.console.templates=/usr/share/prometheus/consoles"
ports:
- containerPort: 9090
resources:
requests:
memory: "2Gi"
cpu: "500m"
limits:
memory: "4Gi"
cpu: "2000m"
volumeMounts:
- name: prometheus-storage
mountPath: /prometheus
- name: prometheus-config
mountPath: /etc/prometheus
- name: prometheus-rules
mountPath: /etc/prometheus/rules
volumes:
- name: prometheus-storage
persistentVolumeClaim:
claimName: prometheus-pvc
- name: prometheus-config
configMap:
name: prometheus-config
- name: prometheus-rules
configMap:
name: prometheus-rules
---
apiVersion: v1
kind: ConfigMap
metadata:
name: prometheus-config
namespace: placino-observability
data:
prometheus.yml: |
global:
scrape_interval: 15s
evaluation_interval: 15s
external_labels:
cluster: "placino-production"
environment: "prod"
alerting:
alertmanagers:
- static_configs:
- targets:
- alertmanager:9093
rule_files:
- /etc/prometheus/rules/*.yml
scrape_configs:
- job_name: 'prometheus'
static_configs:
- targets: ['localhost:9090']
- job_name: 'kubernetes-apiservers'
kubernetes_sd_configs:
- role: endpoints
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
action: keep
regex: default;kubernetes;https
- job_name: 'kubernetes-nodes'
kubernetes_sd_configs:
- role: node
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
- job_name: 'placino-services'
kubernetes_sd_configs:
- role: pod
namespaces:
names:
- placino-backend
- placino-frontend
- placino-data
relabel_configs:
- source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
action: keep
regex: "true"
- source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
action: replace
target_label: __metrics_path__
regex: (.+)
- source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
action: replace
regex: ([^:]+)(?::\d+)?;(\d+)
replacement: $1:$2
target_label: __address__
---
apiVersion: v1
kind: ConfigMap
metadata:
name: prometheus-rules
namespace: placino-observability
data:
placino-alerts.yml: |
groups:
- name: placino-backend
interval: 30s
rules:
- alert: HighQueryLatency
expr: histogram_quantile(0.95, sum(rate(query_latency_seconds_bucket[5m])) by (le)) > 5
for: 5m
annotations:
summary: "High query latency detected"
description: "Query 95th percentile latency is {{ $value }}s"
- alert: HighErrorRate
expr: rate(query_errors_total[5m]) > 0.01
for: 5m
annotations:
summary: "High error rate"
description: "Error rate is {{ $value }}"
- alert: PodCrashLooping
expr: rate(kube_pod_container_status_restarts_total[15m]) > 0.1
for: 5m
annotations:
summary: "Pod is crash looping"
description: "Pod {{ $labels.pod_name }} in namespace {{ $labels.namespace }}"
- alert: PostgreSQLDown
expr: pg_up == 0
for: 1m
annotations:
summary: "PostgreSQL is down"
description: "PostgreSQL has been unavailable for 1 minute"
- alert: HighDiskUsage
expr: (kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes) > 0.85
for: 5m
annotations:
summary: "High disk usage"
description: "Volume {{ $labels.persistentvolumeclaim }} is {{ $value | humanizePercentage }} full"
- alert: MemoryPressure
expr: container_memory_usage_bytes / container_spec_memory_limit_bytes > 0.9
for: 5m
annotations:
summary: "Container memory pressure"
description: "Container {{ $labels.pod_name }} memory usage is {{ $value | humanizePercentage }}"
---
apiVersion: v1
kind: Service
metadata:
name: prometheus
namespace: placino-observability
spec:
type: ClusterIP
ports:
- port: 9090
targetPort: 9090
selector:
app: prometheus
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: grafana
namespace: placino-observability
spec:
replicas: 2
selector:
matchLabels:
app: grafana
template:
metadata:
labels:
app: grafana
spec:
containers:
- name: grafana
image: grafana/grafana:latest
ports:
- containerPort: 3000
env:
- name: GF_SECURITY_ADMIN_PASSWORD
valueFrom:
secretKeyRef:
name: grafana-admin
key: password
- name: GF_INSTALL_PLUGINS
value: "grafana-piechart-panel,grafana-worldmap-panel"
resources:
requests:
memory: "256Mi"
cpu: "100m"
limits:
memory: "512Mi"
cpu: "500m"
volumeMounts:
- name: grafana-storage
mountPath: /var/lib/grafana
- name: grafana-datasources
mountPath: /etc/grafana/provisioning/datasources
volumes:
- name: grafana-storage
persistentVolumeClaim:
claimName: grafana-pvc
- name: grafana-datasources
configMap:
name: grafana-datasources
---
apiVersion: v1
kind: ConfigMap
metadata:
name: grafana-datasources
namespace: placino-observability
data:
prometheus.yaml: |
apiVersion: 1
datasources:
- name: Prometheus
type: prometheus
access: proxy
url: http://prometheus:9090
isDefault: true
editable: true
---
apiVersion: v1
kind: Service
metadata:
name: grafana
namespace: placino-observability
spec:
type: ClusterIP
ports:
- port: 3000
targetPort: 3000
selector:
app: grafana
Production Hardening Checklist
Moving Placino to production requires systematic hardening beyond basic deployment. This checklist covers resource limits, security policies, backup strategies, and compliance controls.
Resource Management
- Set resource requests and limits on all Deployments, StatefulSets, DaemonSets
- Enable ResourceQuota per namespace to prevent runaway pods (see the sketch after this list)
- Configure LimitRange to set defaults for new pods
- Monitor node pressure (memory, disk, PID) via metrics
- Set up node auto-scaling with cluster autoscaler or Karpenter
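For the ResourceQuota and LimitRange items, a per-namespace sketch with illustrative values sized against the production minimums above:
apiVersion: v1
kind: ResourceQuota
metadata:
name: backend-quota
namespace: placino-backend
spec:
hard:
requests.cpu: "16"
requests.memory: 32Gi
limits.cpu: "32"
limits.memory: 64Gi
pods: "100"
---
apiVersion: v1
kind: LimitRange
metadata:
name: backend-defaults
namespace: placino-backend
spec:
limits:
- type: Container
default: # applied when a container sets no limits
cpu: 500m
memory: 512Mi
defaultRequest: # applied when a container sets no requests
cpu: 100m
memory: 256Mi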
Security Policies
- Enforce Pod Security Standards (restricted profile) at namespace level
- Run containers as non-root (securityContext.runAsNonRoot=true); a combined securityContext sketch follows this list
- Drop Linux capabilities (drop: ["ALL"]) and only add required ones
- Mount filesystems as read-only (securityContext.readOnlyRootFilesystem=true)
- Enable SELinux or AppArmor on worker nodes
- Use NetworkPolicies to restrict pod-to-pod communication
- Enable RBAC and audit logging on API server
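The non-root, capability-drop, and read-only-filesystem items combine into pod- and container-level securityContext blocks. A sketch for the query service pod template (the UID is an arbitrary non-root choice):
# Pod-level
securityContext:
runAsNonRoot: true
runAsUser: 10001
fsGroup: 10001
seccompProfile:
type: RuntimeDefault
containers:
- name: query
image: placino/query:1.4.0
# Container-level
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
capabilities:
drop: ["ALL"]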
Data Protection
- Enable encryption at rest for etcd (kube-apiserver --encryption-provider-config)
- Encrypt Kubernetes Secrets with Vault (not base64)
- Use TLS 1.3 for all API server and kubelet communication
- Implement database-level encryption (PostgreSQL pgcrypto, ClickHouse)
- Configure daily snapshots of PostgreSQL and ClickHouse to object storage (a CronJob sketch follows this list)
- Test backup restoration monthly
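For the daily-snapshot item, one option is a pg_dump CronJob writing to a dedicated volume; the backup PVC name is an assumption, and the credentials reuse the postgres-credentials Secret from earlier:
apiVersion: batch/v1
kind: CronJob
metadata:
name: postgres-backup
namespace: placino-data
spec:
schedule: "0 2 * * *" # daily at 02:00
jobTemplate:
spec:
template:
spec:
restartPolicy: OnFailure
containers:
- name: backup
image: postgres:15-alpine
env:
- name: PGUSER
valueFrom:
secretKeyRef:
name: postgres-credentials
key: username
- name: PGPASSWORD
valueFrom:
secretKeyRef:
name: postgres-credentials
key: password
command:
- /bin/sh
- -c
- pg_dump -h postgres.placino-data -U "$PGUSER" placino | gzip > /backup/placino-$(date +%F).sql.gz
volumeMounts:
- name: backup
mountPath: /backup
volumes:
- name: backup
persistentVolumeClaim:
claimName: postgres-backup-pvc # assumed; could also stream to MinIO instead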
High Availability
- Run control plane with 3+ etcd nodes
- Distribute worker nodes across 3+ availability zones
- Use PodDisruptionBudgets for critical services (minAvailable: 2)
- Implement health checks (liveness, readiness, startup probes)
- Configure graceful termination (terminationGracePeriodSeconds: 60)
- Monitor MTTR and MTTF metrics
Compliance and Audit
- Enable API server audit logging (--audit-log-maxage=30; a minimal audit Policy follows this list)
- Log all Vault operations for compliance
- Implement pod-level network monitoring (Cilium/Falco for runtime threats)
- Regular security scans of container images (Trivy, Snyk)
- Document all manual kubectl commands (prefer GitOps)
- Track ConfigMap and Secret changes via version control
- Schedule quarterly penetration testing
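For the audit-logging item, a minimal first-match audit policy file; levels and namespaces should be tuned to your compliance requirements:
# e.g. /etc/kubernetes/audit-policy.yaml, referenced by --audit-policy-file
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
# Skip high-churn internal watches
- level: None
users: ["system:kube-proxy"]
verbs: ["watch"]
# Record who touched secrets/configmaps, but not their contents
- level: Metadata
resources:
- group: ""
resources: ["secrets", "configmaps"]
# Full request/response for changes in Placino namespaces
- level: RequestResponse
verbs: ["create", "update", "patch", "delete"]
namespaces: ["placino-frontend", "placino-backend", "placino-data"]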
Operational Readiness
- Implement centralized logging (ELK, Loki, Splunk) with 90-day retention
- Configure alerting for SLOs (error rate, latency, availability)
- Create runbooks for common incidents (pod crash, database failover, OOM)
- Establish on-call rotation and escalation procedures
- Document cluster maintenance windows
- Test disaster recovery annually (full cluster rebuild)
Air-Gapped Deployment Considerations
Organizations with strict network isolation (air-gapped environments) cannot pull container images or dependencies from public registries. Placino must be deployed with pre-downloaded images, offline package repositories, and internal registries.
Image Registry Setup
# On a machine with internet access:
# 1. Download all Placino images
docker pull placino/auth:1.4.0
docker pull placino/ingestion:1.4.0
docker pull placino/matching:1.4.0
docker pull placino/query:1.4.0
docker pull placino/governance:1.4.0
docker pull placino/audit:1.4.0
docker pull placino/catalog:1.4.0
docker pull placino/ai-proxy:1.4.0
docker pull placino/audience:1.4.0
docker pull placino/ml:1.4.0
docker pull placino/react-frontend:1.4.0
# 2. Download dependencies
docker pull postgres:15-alpine
docker pull clickhouse/clickhouse-server:latest
docker pull confluentinc/cp-kafka:7.5.0
docker pull valkey/valkey:latest
docker pull minio/minio:latest
docker pull kong:3.4-alpine
docker pull openpolicyagent/opa:latest
docker pull prom/prometheus:latest
docker pull grafana/grafana:latest
docker pull hashicorp/vault:latest
# 3. Export as tar archives
docker save placino/auth:1.4.0 | gzip > placino-auth-1.4.0.tar.gz
docker save placino/ingestion:1.4.0 | gzip > placino-ingestion-1.4.0.tar.gz
# ... repeat for all images
# 4. Transfer to air-gapped network via approved media
# 5. On air-gapped network, load images:
docker load < placino-auth-1.4.0.tar.gz
docker load < placino-ingestion-1.4.0.tar.gz
# ... repeat for all images
# 6. Push to internal registry
docker tag placino/auth:1.4.0 registry.internal/placino/auth:1.4.0
docker push registry.internal/placino/auth:1.4.0
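# Steps 5-6 can be scripted for the full image set; IMAGES must match the release manifest
IMAGES="auth ingestion matching query governance audit catalog ai-proxy audience ml react-frontend"
for img in $IMAGES; do
docker load < "placino-${img}-1.4.0.tar.gz"
docker tag "placino/${img}:1.4.0" "registry.internal/placino/${img}:1.4.0"
docker push "registry.internal/placino/${img}:1.4.0"
done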
# ... repeat for all images (or script it with the loop above)
Helm Chart Air-Gap Configuration
# values-airgap.yaml
image:
registry: registry.internal
repository: placino
tag: "1.4.0"
pullPolicy: IfNotPresent
imagePullSecrets:
- name: registry-credentials
frontend:
react:
image:
registry: registry.internal
repository: placino/react-frontend
tag: "1.4.0"
backend:
auth:
image:
registry: registry.internal
repository: placino/auth
tag: "1.4.0"
ingestion:
image:
registry: registry.internal
repository: placino/ingestion
tag: "1.4.0"
data:
postgres:
image:
registry: registry.internal
repository: postgres
tag: "15-alpine"
clickhouse:
image:
registry: registry.internal
repository: clickhouse/clickhouse-server
tag: "latest"
kafka:
image:
registry: registry.internal
repository: confluentinc/cp-kafka
tag: "7.5.0"
valkey:
image:
registry: registry.internal
repository: valkey/valkey
tag: "latest"
minio:
image:
registry: registry.internal
repository: minio/minio
tag: "latest"
kong:
image:
registry: registry.internal
repository: kong
tag: "3.4-alpine"
opa:
image:
registry: registry.internal
repository: openpolicyagent/opa
tag: "latest"
observability:
prometheus:
image:
registry: registry.internal
repository: prom/prometheus
tag: "latest"
grafana:
image:
registry: registry.internal
repository: grafana/grafana
tag: "latest"
vault:
image:
registry: registry.internal
repository: hashicorp/vault
tag: "latest"Network Isolation in Air-Gap
# NetworkPolicy for air-gapped environment
# Deny all egress by default, allow only internal communication
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: airgap-default-deny-egress
namespace: placino-backend
spec:
podSelector: {}
policyTypes:
- Egress
egress:
# Allow DNS within cluster
- to:
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: kube-system
ports:
- protocol: UDP
port: 53
# Allow internal service discovery
- to:
- podSelector: {}
ports:
- protocol: TCP
port: 8000
- protocol: TCP
port: 8010
- protocol: TCP
port: 8020
- protocol: TCP
port: 8030
- protocol: TCP
port: 8040
- protocol: TCP
port: 8050
- protocol: TCP
port: 8070
- protocol: TCP
port: 8090
- protocol: TCP
port: 8089
- protocol: TCP
port: 8088
# Allow to data layer
- to:
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: placino-data
ports:
- protocol: TCP
port: 5432
- protocol: TCP
port: 8123
- protocol: TCP
port: 9092
- protocol: TCP
port: 6379
- protocol: TCP
port: 9000
Helm Chart Dependencies in Air-Gap
# Chart.yaml
apiVersion: v2
name: placino
description: Self-hosted data clean room platform
type: application
version: 1.4.0
appVersion: "1.4.0"
# If using subcharts, download them offline
dependencies:
- name: postgresql
version: "12.x.x"
repository: "file://../charts/postgresql"
alias: postgres-internal
- name: prometheus
version: "15.x.x"
repository: "file://../charts/prometheus"
- name: grafana
version: "6.x.x"
repository: "file://../charts/grafana"
# Preparation on a machine with internet access:
# 1. Download chart dependencies
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
helm dependency update ./placino-helm
# 2. Package with dependencies included
helm package ./placino-helm --destination ./releases
# 3. Transfer to air-gapped environment
# 4. Install from local tarball
helm install placino ./releases/placino-1.4.0.tgz \
--namespace placino \
--values values-airgap.yaml \
--set image.registry=registry.internal
Conclusion
Deploying Placino on Kubernetes scales from development clusters (single node, permissive policies) to production installations (multi-node, air-gapped networks, strict isolation). The patterns outlined in this guide—namespace segmentation, OPA policy enforcement, Vault secret management, and comprehensive observability—form the foundation for secure, reliable data clean room operations.
The key principles apply across all deployment scales: enforce network policies at the CNI level, manage secrets dynamically, distribute workloads across availability zones, monitor every component, and test disaster recovery scenarios regularly. Start with the checklist, deploy incrementally, and harden progressively as production loads increase.
For air-gapped environments, prepare image registries and Helm charts in advance, test the full deployment offline, and maintain documentation for network segmentation and secret rotation procedures.