System Architecture
Placino is a containerized, distributed data clean room built on 10 Go microservices across 35 Docker containers with privacy-preserving matching at its core.
Architecture Overview
Placino separates concerns across three tiers:
API & Ingestion Layer
REST API, gRPC endpoints, and 6 data ingestion channels (CSV, Parquet, Kafka, PostgreSQL, BigQuery, Salesforce API)
Processing & Privacy Layer
Envelope encryption with AES-256-GCM, differential privacy engine, k-anonymity checker, matching service
Storage & Orchestration
PostgreSQL data store, Redis message queue, Prometheus monitoring, Kubernetes orchestration
10 Core Microservices
Each service runs in its own isolated containers and communicates over a Redis message bus:
Core API
REST/gRPC gateway. Routes requests, enforces authentication, manages projects and datasets. Listens on port 8080.
Data Ingestion Service
Handles CSV uploads, Parquet streaming, Kafka topic subscription, PostgreSQL CDC, BigQuery exports, Salesforce API pulls.
Encryption Service
AES-256-GCM envelope encryption, key rotation, ephemeral hash generation for record matching.
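As an illustration of envelope encryption, the sketch below encrypts a payload under a freshly generated data key and then wraps that data key under a master key, both with AES-256-GCM. It is a minimal sketch, not Placino's implementation: the names Envelope, EncryptEnvelope, and DecryptEnvelope are hypothetical, and the master key is generated in memory here, whereas a real deployment would normally hold it in a KMS or HSM.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
)

// Envelope (hypothetical type) pairs a ciphertext with the wrapped data key.
type Envelope struct {
	WrappedKey []byte // data key encrypted under the master key, nonce prepended
	Ciphertext []byte // payload encrypted under the data key, nonce prepended
}

// seal encrypts plaintext with AES-GCM and prepends the random nonce.
func seal(key, plaintext []byte) ([]byte, error) {
	block, err := aes.NewCipher(key) // a 32-byte key selects AES-256
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// open reverses seal and fails if the key is wrong or the data was modified.
func open(key, sealed []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, fmt.Errorf("sealed data shorter than nonce")
	}
	nonce, ct := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ct, nil)
}

// EncryptEnvelope generates a one-off data key, encrypts the payload with it,
// and wraps the data key under the master key.
func EncryptEnvelope(masterKey, plaintext []byte) (*Envelope, error) {
	dataKey := make([]byte, 32)
	if _, err := rand.Read(dataKey); err != nil {
		return nil, err
	}
	ct, err := seal(dataKey, plaintext)
	if err != nil {
		return nil, err
	}
	wrapped, err := seal(masterKey, dataKey)
	if err != nil {
		return nil, err
	}
	return &Envelope{WrappedKey: wrapped, Ciphertext: ct}, nil
}

// DecryptEnvelope unwraps the data key, then decrypts the payload.
func DecryptEnvelope(masterKey []byte, env *Envelope) ([]byte, error) {
	dataKey, err := open(masterKey, env.WrappedKey)
	if err != nil {
		return nil, err
	}
	return open(dataKey, env.Ciphertext)
}

func main() {
	masterKey := make([]byte, 32) // assumption: in-memory master key for the demo
	if _, err := rand.Read(masterKey); err != nil {
		panic(err)
	}
	env, err := EncryptEnvelope(masterKey, []byte("alice@example.com"))
	if err != nil {
		panic(err)
	}
	plain, err := DecryptEnvelope(masterKey, env)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(plain))
}
```

With this layout, rotating the master key only requires re-wrapping the small data keys, not re-encrypting the stored payloads.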
Matching Engine
Privacy-preserving record linkage using encrypted hashes. Never sees raw PII. Supports deterministic and probabilistic matching.
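To make the deterministic case concrete, here is a minimal sketch of matching on keyed hashes. It assumes HMAC-SHA256 over a normalized identifier with a key shared for the duration of one match; the source does not specify the hash construction, so MatchToken, Overlap, and the normalization rule are hypothetical. Probabilistic matching is not covered.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// MatchToken normalizes an identifier (trim, lowercase) and returns its
// HMAC-SHA256 under the given key, hex-encoded. Without the key, tokens
// cannot be recomputed from guessed identifiers.
func MatchToken(key []byte, identifier string) string {
	norm := strings.ToLower(strings.TrimSpace(identifier))
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte(norm))
	return hex.EncodeToString(mac.Sum(nil))
}

// Overlap returns the tokens present in both slices, each at most once.
// It only ever compares tokens, never raw identifiers.
func Overlap(a, b []string) []string {
	seen := make(map[string]struct{}, len(a))
	for _, t := range a {
		seen[t] = struct{}{}
	}
	var out []string
	for _, t := range b {
		if _, ok := seen[t]; ok {
			out = append(out, t)
			delete(seen, t) // avoid reporting duplicates
		}
	}
	return out
}

func main() {
	key := []byte("ephemeral-match-key") // assumption: per-match shared key

	partyA := []string{
		MatchToken(key, "alice@example.com"),
		MatchToken(key, "bob@example.com"),
	}
	partyB := []string{
		MatchToken(key, " Bob@Example.com "),
		MatchToken(key, "carol@example.com"),
	}

	fmt.Println("matched records:", len(Overlap(partyA, partyB)))
}
```

Using a key that is discarded after the match means tokens from one match cannot be joined against tokens from another.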
Query Processor
Parses queries, validates privacy constraints, and routes each query to one of 5 query modes (template, SQL, NL, aggregation, export).
Differential Privacy Engine
Applies Laplace noise based on epsilon budget, tracks cumulative privacy loss, enforces per-query and per-user epsilon caps.
K-Anonymity Checker
Validates that results satisfy k-anonymity before release. Rejects any result containing cells smaller than the k threshold.
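A minimal sketch of such a check over an aggregated result, assuming each output row carries a group label and a record count. Cell and CheckKAnonymity are hypothetical names, and the whole result is rejected if any cell falls below k, as described above.

```go
package main

import "fmt"

// Cell (hypothetical type) is one row of an aggregated result: a group of
// quasi-identifier values and the number of records in that group.
type Cell struct {
	Group string
	Count int
}

// CheckKAnonymity returns an error if any cell contains fewer than k
// records, in which case the entire result should be withheld.
func CheckKAnonymity(cells []Cell, k int) error {
	for _, c := range cells {
		if c.Count < k {
			return fmt.Errorf("cell %q has %d records, below k=%d", c.Group, c.Count, k)
		}
	}
	return nil
}

func main() {
	result := []Cell{
		{Group: "age 25-34, region north", Count: 120},
		{Group: "age 65+, region north", Count: 3},
	}
	if err := CheckKAnonymity(result, 5); err != nil {
		fmt.Println("result rejected:", err)
		return
	}
	fmt.Println("result released")
}
```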
Audit & Logging Service
Immutable Merkle-chain audit logs. Records all data access, queries, and privacy parameter consumption.
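The tamper-evidence idea can be sketched as a simple hash chain, where each entry's hash commits to the previous entry's hash. This is an assumption about what "Merkle-chain" means here, not a description of the actual log format; Entry, Log, Append, and Verify are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Entry (hypothetical type) is one audit record linked to its predecessor.
type Entry struct {
	Event    string
	PrevHash string // hash of the previous entry, empty for the first
	Hash     string // SHA-256 over PrevHash and Event
}

// hashEntry computes the hash that binds an event to the chain so far.
func hashEntry(prev, event string) string {
	sum := sha256.Sum256([]byte(prev + "\n" + event))
	return hex.EncodeToString(sum[:])
}

// Log is an append-only, in-memory chain of entries.
type Log struct {
	entries []Entry
}

// Append adds an event, linking it to the hash of the last entry.
func (l *Log) Append(event string) Entry {
	prev := ""
	if n := len(l.entries); n > 0 {
		prev = l.entries[n-1].Hash
	}
	e := Entry{Event: event, PrevHash: prev, Hash: hashEntry(prev, event)}
	l.entries = append(l.entries, e)
	return e
}

// Verify recomputes the chain and returns the index of the first entry
// that does not match, or -1 if the chain is intact.
func (l *Log) Verify() int {
	prev := ""
	for i, e := range l.entries {
		if e.PrevHash != prev || e.Hash != hashEntry(prev, e.Event) {
			return i
		}
		prev = e.Hash
	}
	return -1
}

func main() {
	log := &Log{}
	log.Append("user=42 query=aggregation epsilon=0.5")
	log.Append("user=42 export=denied")
	log.Append("user=7 query=template epsilon=0.1")
	fmt.Println("first bad entry before tampering:", log.Verify())

	log.entries[1].Event = "user=42 export=allowed" // simulate tampering
	fmt.Println("first bad entry after tampering:", log.Verify())
}
```

Because every hash depends on all earlier entries, editing or deleting a record changes every later hash, so tampering is detectable as long as the latest hash is stored somewhere the attacker cannot rewrite.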
Activation Service
Integration with 13 ad platforms. Activates query results as segments.
Admin & Governance
User/role management, data subject rights (GDPR DSR), audit log exports, compliance reporting.
Storage & Infrastructure
PostgreSQL
Primary data store for encrypted datasets, metadata, audit logs, user/role definitions. Supports 8 database connectors for external source ingestion.
Redis
Message queue for inter-service communication. Caches encryption keys, query results, and privacy epsilon budgets. Enables horizontal scaling.
Prometheus & Grafana
Metrics collection from all 35 containers. Pre-built dashboards for query latency, privacy budget consumption, ingestion throughput, and error rates.
External Connectors
Databases: PostgreSQL, MySQL, BigQuery, Snowflake, Redshift, ClickHouse, Databricks, Salesforce CRM. Ad Platforms: Google, Meta, TikTok, LinkedIn, and 9 others (13 total).
End-to-End Data Flow
How data moves through the system:
Ingestion
A customer uploads a CSV file or enables Kafka/PostgreSQL streaming. Data lands in the Data Ingestion Service with an encryption flag.
Encryption
The Encryption Service applies AES-256-GCM envelope encryption to sensitive columns.
Storage
Encrypted data is written to PostgreSQL with column-level access controls.
Query
A user submits a query via the API. The Query Processor validates it and routes it to the appropriate engine.
Privacy Enforcement
The Differential Privacy Engine applies Laplace noise. The K-Anonymity Checker then validates that the result meets the k threshold.
Activation (Optional)
Users can activate query results as segments on ad platforms via the Activation Service.
Scalability & HA
Horizontal Scaling
All microservices are stateless and can scale horizontally via Kubernetes replicas.
High Availability
PostgreSQL is configured with synchronous replication, and Redis runs in cluster mode for HA. All services auto-restart on failure.
Network Reduction
Because PII stays encrypted, Placino reduces network demand 40x compared with traditional data warehouses.