Data Ingestion Channels
Placino supports six data ingestion channels, ranging from one-time CSV uploads to real-time streaming and scheduled exports from enterprise data warehouses.
Overview
Channel 1: CSV Upload
Drag-and-drop CSV files via web UI or API. Automatic schema detection, encoding handling, and encryption.
Best for: Ad hoc datasets, pilot programs
Channel 2: Parquet Streaming
Stream Parquet files from S3, GCS, Azure Blob, local storage, or HTTP. Columnar compression reduces ingestion bandwidth 50-80x.
Best for: Large batch exports, data warehouse snapshots
Channel 3: Kafka Topics
Real-time streaming from Kafka topics. Automatic offset tracking, back-pressure handling, and idempotent writes.
Best for: Real-time event streams, CDP integrations
Channel 4: PostgreSQL CDC
Change Data Capture from PostgreSQL. Listens to logical replication slots for INSERT/UPDATE/DELETE events.
Best for: CRM sync, transactional data pipelines
Channel 5: BigQuery Export
Direct SQL query export from BigQuery. Supports scheduled exports and incremental snapshots with dedup.
Best for: Cloud data warehouse ingestion
Channel 6: Salesforce API
Ingest Salesforce Contacts, Leads, and Accounts. Automatic field mapping, incremental updates via SOQL.
Best for: CRM data collaboration, lead scoring
Channel Details
CSV Upload
Simplest option for testing or one-time datasets.
Maximum size: 1 GB per file. Automatic UTF-8 detection. Supports quoted fields, escape characters, and custom delimiters.
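The checks described above (size limit, UTF-8 handling, delimiter detection) can be approximated locally before uploading. The following is a minimal sketch using only the Python standard library; it is not Placino's own detection logic, and the function name inspect_csv, the candidate delimiter set, and the reading of 1 GB as 1024^3 bytes are all assumptions made for illustration.

```python
import csv
import io

# Assumption: "1 GB" is interpreted as 1024**3 bytes.
MAX_BYTES = 1 * 1024**3


def inspect_csv(raw: bytes, candidates: str = ",;\t|") -> dict:
    """Decode CSV bytes, guess the delimiter, and split header from rows."""
    if len(raw) > MAX_BYTES:
        raise ValueError("file exceeds the 1 GB limit")
    # "utf-8-sig" decodes plain UTF-8 and strips a leading BOM if present.
    text = raw.decode("utf-8-sig")
    # csv.Sniffer is a heuristic; it only considers the candidate delimiters.
    dialect = csv.Sniffer().sniff(text[:4096], delimiters=candidates)
    rows = list(csv.reader(io.StringIO(text), dialect))
    return {"delimiter": dialect.delimiter, "header": rows[0], "rows": rows[1:]}
```

Running this on a small sample before upload gives an early warning when the delimiter or encoding is not what you expect.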
Parquet Streaming
Efficient bulk ingestion for data warehouse snapshots.
Supports S3, GCS, Azure Blob, local file://, http://. Automatic decompression (snappy, gzip, lz4).
Kafka Streaming
Real-time event ingestion from message brokers.
Supports Avro, JSON, Protobuf schemas. Automatic offset management. Exactly-once delivery semantics.
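To illustrate how offset tracking and idempotent writes combine to give exactly-once results, here is a small self-contained simulation. It does not use a Kafka client and is not Placino's implementation; Record and IdempotentSink are hypothetical names, and the sketch assumes offsets increase monotonically within each partition.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Record:
    topic: str
    partition: int
    offset: int
    value: dict


@dataclass
class IdempotentSink:
    """Hypothetical sink that skips records it has already applied."""
    rows: list = field(default_factory=list)
    # Maps (topic, partition) to the last offset that was applied.
    committed: dict = field(default_factory=dict)

    def write(self, rec: Record) -> bool:
        key = (rec.topic, rec.partition)
        # Anything at or below the committed offset is a redelivery.
        if rec.offset <= self.committed.get(key, -1):
            return False
        self.rows.append(rec.value)
        self.committed[key] = rec.offset
        return True
```

If a consumer restarts and the broker redelivers a record, write returns False and the stored rows are unchanged, so retries do not create duplicates.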
PostgreSQL CDC
Capture row-level changes from PostgreSQL via logical replication.
Requires a logical replication slot on the source database. Supports any table schema.
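As a rough picture of what consuming INSERT/UPDATE/DELETE events involves, the sketch below applies change events to an in-memory replica keyed by primary key. The event shape (the "op", "key", and "row" fields) is an assumption made for this example, not the wire format of PostgreSQL logical replication or of Placino.

```python
from typing import Any


def apply_change(table: dict[Any, dict], event: dict) -> None:
    """Apply one change event to an in-memory replica keyed by primary key."""
    op, key = event["op"], event["key"]
    if op == "INSERT":
        table[key] = dict(event["row"])
    elif op == "UPDATE":
        # Merge changed columns; a missing row is treated as an upsert.
        table.setdefault(key, {}).update(event["row"])
    elif op == "DELETE":
        # Deleting an absent key is a no-op, so replays are harmless.
        table.pop(key, None)
    else:
        raise ValueError(f"unknown operation: {op!r}")
```

Treating UPDATE as an upsert and DELETE as a no-op on missing keys keeps the replica consistent even if an event is replayed.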
BigQuery Export
Export query results from BigQuery with scheduled runs.
Supports parameterized queries for incremental loads. Automatic service account setup.
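Incremental loads can return the same row more than once when a record changes between runs, which is why deduplication matters for incremental snapshots. The following sketch shows one common approach, keeping the newest version of each key. The function name dedup_latest and the column names in the usage note are hypothetical, and this is not necessarily how Placino deduplicates.

```python
def dedup_latest(rows: list[dict], key: str, version: str) -> list[dict]:
    """Keep one row per key: the one with the greatest version value."""
    latest: dict = {}
    for row in rows:
        current = latest.get(row[key])
        # Replace only when the incoming row is strictly newer.
        if current is None or row[version] > current[version]:
            latest[row[key]] = row
    return list(latest.values())
```

For example, dedup_latest(rows, key="id", version="updated_at") collapses repeated exports of the same id into its most recent row.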
Salesforce API
Sync Contacts, Leads, and Accounts from Salesforce CRM.
Authenticates via the OAuth 2.0 flow. Supports field-level mapping. Incremental sync uses SystemModstamp.
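An incremental sync based on SystemModstamp amounts to querying only the records modified since the last successful run. The sketch below builds such a SOQL query as a string. The function name incremental_soql is hypothetical, the caller is assumed to store the previous watermark, and the exact query Placino issues is not documented here.

```python
from datetime import datetime, timezone


def incremental_soql(sobject: str, fields: list[str], since: datetime) -> str:
    """Build a SOQL query selecting records modified after `since`."""
    if since.tzinfo is None:
        raise ValueError("since must be timezone-aware")
    # SOQL dateTime literals are unquoted ISO 8601 values.
    stamp = since.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    cols = ", ".join(fields + ["SystemModstamp"])
    return (
        f"SELECT {cols} FROM {sobject} "
        f"WHERE SystemModstamp > {stamp} ORDER BY SystemModstamp ASC"
    )
```

Ordering by SystemModstamp lets the caller record the last value seen and use it as the watermark for the next run.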
Encryption During Ingestion
All ingestion channels apply envelope encryption automatically:
Sensitive columns (email, phone, SSN) are encrypted with AES-256-GCM.
PII hashes (for matching) are ephemeral and only exist during query execution.
Non-sensitive columns (age_group, brand, region) remain queryable in plaintext.
Ingestion logs are stored immutably in a Merkle-chain audit trail.
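To show why a hash-linked audit trail is tamper-evident, here is a minimal sketch of a linear hash chain in which each log entry commits to the hash of the previous one. This is a simplified stand-in rather than Placino's actual Merkle-chain structure; the entry layout, the GENESIS value, and the function names are assumptions.

```python
import hashlib
import json

# Assumption: the first entry links to a fixed all-zero hash.
GENESIS = "0" * 64


def _digest(prev: str, event: dict) -> str:
    # Serialize with sorted keys so the same event always hashes the same way.
    payload = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev + payload).encode("utf-8")).hexdigest()


def append_entry(log: list[dict], event: dict) -> dict:
    """Append an event, linking it to the hash of the previous entry."""
    prev = log[-1]["hash"] if log else GENESIS
    entry = {"event": event, "prev": prev, "hash": _digest(prev, event)}
    log.append(entry)
    return entry


def verify(log: list[dict]) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != _digest(prev, entry["event"]):
            return False
        prev = entry["hash"]
    return True
```

Because each hash depends on all earlier entries, changing any past event invalidates every hash after it, which verify detects.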
Schema Management
Placino automatically detects or lets you specify column types and sensitivity: