System Architecture

Placino is a containerized, distributed data clean room built on 10 Go microservices across 35 Docker containers with privacy-preserving matching at its core.

Architecture Overview

Placino separates concerns across three tiers:

API & Ingestion Layer

REST API, gRPC endpoints, and 6 data ingestion channels (CSV, Parquet, Kafka, PostgreSQL, BigQuery, Salesforce API)

Processing & Privacy Layer

Envelope encryption with AES-256-GCM, differential privacy engine, k-anonymity checker, matching service

Storage & Orchestration

PostgreSQL data store, Redis message queue, Prometheus monitoring, Kubernetes orchestration

10 Core Microservices

Each service runs in its own isolated containers and communicates with the others over a Redis message bus:

Core API

REST/gRPC gateway. Routes requests, enforces authentication, manages projects and datasets. Listens on port 8080.

Data Ingestion Service

Handles CSV uploads, Parquet streaming, Kafka topic subscription, PostgreSQL CDC, BigQuery exports, Salesforce API pulls.

Encryption Service

AES-256-GCM envelope encryption, key rotation, ephemeral hash generation for record matching.
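
A minimal sketch of envelope encryption with AES-256-GCM, using only the Go standard library. It is an illustration, not Placino's implementation: the `Envelope` type, the `Encrypt`/`Decrypt` helpers, and the choice to prepend the nonce to each ciphertext are assumptions. A fresh data key encrypts the value, and a key-encryption key (KEK) wraps the data key.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// Envelope is a hypothetical container: the data key is stored only in
// wrapped (encrypted) form next to the ciphertext it protects.
type Envelope struct {
	WrappedKey []byte // nonce || AES-256-GCM(KEK, dataKey)
	Ciphertext []byte // nonce || AES-256-GCM(dataKey, plaintext)
}

func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key) // a 32-byte key selects AES-256
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// seal encrypts plaintext under key and prepends a fresh random nonce.
func seal(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// open splits off the nonce and authenticates and decrypts the rest.
func open(key, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ct := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ct, nil)
}

// Encrypt generates a one-off data key, encrypts the plaintext with it,
// and wraps the data key under the KEK.
func Encrypt(kek, plaintext []byte) (Envelope, error) {
	dataKey := make([]byte, 32)
	if _, err := rand.Read(dataKey); err != nil {
		return Envelope{}, err
	}
	ct, err := seal(dataKey, plaintext)
	if err != nil {
		return Envelope{}, err
	}
	wrapped, err := seal(kek, dataKey)
	if err != nil {
		return Envelope{}, err
	}
	return Envelope{WrappedKey: wrapped, Ciphertext: ct}, nil
}

// Decrypt unwraps the data key with the KEK, then decrypts the payload.
func Decrypt(kek []byte, env Envelope) ([]byte, error) {
	dataKey, err := open(kek, env.WrappedKey)
	if err != nil {
		return nil, err
	}
	return open(dataKey, env.Ciphertext)
}

func main() {
	kek := make([]byte, 32)
	if _, err := rand.Read(kek); err != nil {
		panic(err)
	}
	env, err := Encrypt(kek, []byte("alice@example.com"))
	if err != nil {
		panic(err)
	}
	pt, err := Decrypt(kek, env)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(pt))
}
```

Under this layout, rotating the KEK only requires re-wrapping the small data keys, not re-encrypting the stored data.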

Matching Engine

Privacy-preserving record linkage using encrypted hashes. Never sees raw PII. Supports deterministic and probabilistic matching.
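
As a rough illustration of deterministic matching on hashes rather than raw PII, the sketch below derives a keyed token from each identifier with HMAC-SHA256 and counts the overlap between two token sets. The function names, the normalization rule (trim and lowercase), and the shared-key setup are assumptions for the example; the source does not specify how Placino derives its hashes.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// matchToken (hypothetical) normalizes an identifier and returns its
// HMAC-SHA256 under a key shared for one matching run. Without the key,
// the token cannot be recomputed from a guessed identifier.
func matchToken(key []byte, identifier string) string {
	norm := strings.ToLower(strings.TrimSpace(identifier))
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte(norm))
	return hex.EncodeToString(mac.Sum(nil))
}

// intersect counts distinct tokens present in both lists. It only ever
// sees tokens, never the underlying identifiers.
func intersect(a, b []string) int {
	seen := make(map[string]struct{}, len(a))
	for _, t := range a {
		seen[t] = struct{}{}
	}
	n := 0
	for _, t := range b {
		if _, ok := seen[t]; ok {
			n++
			delete(seen, t) // count each token at most once
		}
	}
	return n
}

func main() {
	key := []byte("per-run matching key (example only)")

	var partyA, partyB []string
	for _, id := range []string{"Alice@Example.com ", "bob@example.com"} {
		partyA = append(partyA, matchToken(key, id))
	}
	for _, id := range []string{"alice@example.com", "carol@example.com"} {
		partyB = append(partyB, matchToken(key, id))
	}

	fmt.Println(intersect(partyA, partyB)) // prints 1
}
```

Using a fresh key for each run is one way to make tokens ephemeral: tokens from different runs cannot be joined against each other.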

Query Processor

Parses queries, validates privacy constraints, and routes each query to one of 5 query modes (template, SQL, NL, aggregation, export).

Differential Privacy Engine

Applies Laplace noise based on epsilon budget, tracks cumulative privacy loss, enforces per-query and per-user epsilon caps.
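
The following sketch shows one way the Laplace mechanism and budget tracking could fit together for a counting query (sensitivity 1, so the noise scale is 1/epsilon). The `Budget` type, its fields, and `NoisyCount` are hypothetical names, and the sketch tracks a single budget rather than separate per-user ones. The sampler relies on the fact that the difference of two independent Exp(1) draws is Laplace(0, 1).

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

var (
	ErrInvalidEpsilon = errors.New("epsilon must be positive and within the per-query cap")
	ErrBudgetExceeded = errors.New("cumulative epsilon budget exceeded")
)

// Budget (hypothetical) tracks cumulative privacy loss for one user.
type Budget struct {
	PerQueryCap float64 // largest epsilon a single query may spend
	Total       float64 // cumulative cap across all queries
	spent       float64
}

// Spend records eps against the budget, or rejects the query.
func (b *Budget) Spend(eps float64) error {
	if eps <= 0 || eps > b.PerQueryCap {
		return ErrInvalidEpsilon
	}
	if b.spent+eps > b.Total {
		return ErrBudgetExceeded
	}
	b.spent += eps
	return nil
}

// Remaining returns the unspent part of the cumulative budget.
func (b *Budget) Remaining() float64 { return b.Total - b.spent }

// laplace draws from Laplace(0, scale) as the scaled difference of two
// independent standard exponential samples.
func laplace(scale float64) float64 {
	return scale * (rand.ExpFloat64() - rand.ExpFloat64())
}

// NoisyCount charges eps to the budget and returns the count plus noise.
// The sampler is injected so tests can substitute a deterministic one.
func NoisyCount(b *Budget, sample func(scale float64) float64, trueCount int, eps float64) (float64, error) {
	if err := b.Spend(eps); err != nil {
		return 0, err
	}
	return float64(trueCount) + sample(1/eps), nil
}

func main() {
	b := &Budget{PerQueryCap: 0.5, Total: 1.0}
	for i := 0; i < 3; i++ {
		v, err := NoisyCount(b, laplace, 1000, 0.5)
		if err != nil {
			fmt.Println("query rejected:", err)
			continue
		}
		fmt.Printf("noisy count: %.1f (remaining epsilon: %.2f)\n", v, b.Remaining())
	}
}
```

With these example caps, the third query is rejected because the first two exhaust the cumulative budget. Naive floating-point Laplace samplers like this one are known to leak information through low-order bits, so production systems generally use a hardened sampler.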

K-Anonymity Checker

Validates that results satisfy k-anonymity before release. Rejects any result containing a cell smaller than the k threshold.
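
A small sketch of the release check, assuming a result is a list of grouped cells with record counts. The `Cell` type and `CheckK` function are illustrative names, not Placino's API. The whole result is rejected if any cell falls below k.

```go
package main

import "fmt"

// Cell (hypothetical) is one group in an aggregated result.
type Cell struct {
	Group string
	Count int
}

// CheckK returns an error if any cell has fewer than k records.
// The error reports only how many cells failed, not which ones, so the
// rejection message itself does not point at the small groups.
func CheckK(cells []Cell, k int) error {
	small := 0
	for _, c := range cells {
		if c.Count < k {
			small++
		}
	}
	if small > 0 {
		return fmt.Errorf("result rejected: %d cell(s) below k=%d", small, k)
	}
	return nil
}

func main() {
	result := []Cell{
		{Group: "age 25-34", Count: 120},
		{Group: "age 35-44", Count: 7},
	}
	if err := CheckK(result, 10); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("result released")
}
```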

Audit & Logging Service

Immutable Merkle-chain audit logs. Records all data access, queries, and privacy parameter consumption.
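
The source does not define "Merkle-chain", so the sketch below assumes the simplest reading: a hash chain in which each entry's SHA-256 hash covers its sequence number, the previous entry's hash, and the event text. The `Log` and `Entry` types are hypothetical. Altering any past entry breaks verification from that point on.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// Entry (hypothetical) is one audit record linked to its predecessor.
type Entry struct {
	Event    string
	PrevHash [32]byte
	Hash     [32]byte
}

// Log is an append-only, in-memory chain of entries.
type Log struct {
	entries []Entry
}

// hashEntry commits to the position, the previous hash, and the event.
func hashEntry(seq uint64, event string, prev [32]byte) [32]byte {
	h := sha256.New()
	var seqBuf [8]byte
	binary.BigEndian.PutUint64(seqBuf[:], seq)
	h.Write(seqBuf[:])
	h.Write(prev[:])
	h.Write([]byte(event))
	var out [32]byte
	copy(out[:], h.Sum(nil))
	return out
}

// Append links a new event to the current head of the chain.
func (l *Log) Append(event string) Entry {
	var prev [32]byte // all zeros for the first entry
	if n := len(l.entries); n > 0 {
		prev = l.entries[n-1].Hash
	}
	e := Entry{Event: event, PrevHash: prev}
	e.Hash = hashEntry(uint64(len(l.entries)), event, prev)
	l.entries = append(l.entries, e)
	return e
}

// Verify recomputes the chain and returns the index of the first entry
// that fails, or -1 if the whole chain is intact.
func (l *Log) Verify() int {
	var prev [32]byte
	for i, e := range l.entries {
		if e.PrevHash != prev || e.Hash != hashEntry(uint64(i), e.Event, prev) {
			return i
		}
		prev = e.Hash
	}
	return -1
}

func main() {
	log := &Log{}
	log.Append("user=42 query=q1 epsilon=0.5")
	log.Append("user=42 export=segment-7")
	fmt.Println(log.Verify()) // prints -1

	log.entries[0].Event = "user=42 query=q1 epsilon=0.1" // tamper
	fmt.Println(log.Verify())                             // prints 0
}
```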

Activation Service

Integration with 13 ad platforms. Activates query results as segments.

Admin & Governance

User/role management, data subject rights (GDPR DSR), audit log exports, compliance reporting.

Storage & Infrastructure

PostgreSQL

Primary data store for encrypted datasets, metadata, audit logs, user/role definitions. Supports 8 database connectors for external source ingestion.

Redis

Message queue for inter-service communication. Caches encryption keys, query results, and privacy epsilon budgets. Enables horizontal scaling.

Prometheus & Grafana

Metrics collection from all 35 containers. Pre-built dashboards for query latency, privacy budget consumption, ingestion throughput, and error rates.

External Connectors

Databases: PostgreSQL, MySQL, BigQuery, Snowflake, Redshift, ClickHouse, Databricks, Salesforce CRM.

Ad Platforms: Google, Meta, TikTok, LinkedIn, and 9 others (13 total).

End-to-End Data Flow

How data moves through the system:

Ingestion

A customer uploads a CSV file or enables Kafka/PostgreSQL streaming. The data lands in the Data Ingestion Service with an encryption flag.

Encryption

Encryption Service applies AES-256-GCM envelope encryption to sensitive columns.

Storage

Encrypted data is written to PostgreSQL with column-level access controls.

Query

A user submits a query via the API. The Query Processor validates it and routes it to the appropriate engine.

Privacy Enforcement

Differential Privacy Engine applies Laplace noise. K-Anonymity Checker validates result anonymity.

Activation (Optional)

User can activate query results as segments to ad platforms via Activation Service.

Scalability & HA

Horizontal Scaling

All microservices are stateless and can scale horizontally via Kubernetes replicas.

High Availability

PostgreSQL is configured with synchronous replication, and Redis runs in cluster mode for HA. All services restart automatically on failure.

Network Reduction

Placino reduces network demand by a factor of 40 compared with traditional data warehouses by keeping PII encrypted.
