Data Architecture & Routing
This page explains what data BPP touches, where it lives, and how it moves from BigQuery to downstream platforms.
Where data lives
| Tier | Location | Contents |
|---|---|---|
| Customer data | Customer's BigQuery project (customer-controlled) | Raw user tables, event/transaction tables, enriched *_bpp tables, and AI model output tables. Read in place; results written back; not duplicated into a ByTek data lake. |
| Identity graph | ByTek-managed Cloud SQL (PostgreSQL), one isolated database per customer, GCP EU | The bytek_id mapping and the identifiers required to reconcile users across tables, used solely for identity resolution. |
| ETL / model processing | ByTek EU infrastructure (GCP Belgium + Hetzner, Germany) | Transient processing during reconciliation, feature engineering, and scoring. |
A metadata table in the customer's BigQuery dataset (bpp_schema_info, the schema registry)
declares every table/field, identifier type, PII flag, and visibility setting. PII flags
govern how each field may be used and activated.
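As a rough illustration of how a registry like bpp_schema_info could drive field handling, the sketch below models a registry entry and a decision function. The attribute names (`identifier_type`, `is_pii`, `visible`) and the three handling outcomes are hypothetical; the actual column layout and rules of bpp_schema_info are not specified on this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SchemaField:
    """One hypothetical registry row; real bpp_schema_info columns may differ."""
    table: str
    field: str
    identifier_type: Optional[str]  # e.g. "email"; None for non-identifier fields
    is_pii: bool
    visible: bool


def handling(entry: SchemaField) -> str:
    """Decide how a field may be activated (assumed policy, for illustration only)."""
    if entry.is_pii and entry.identifier_type is not None:
        return "hash"        # PII identifiers are only ever sent hashed
    if entry.is_pii or not entry.visible:
        return "block"       # other PII and hidden fields never leave
    return "send_as_is"      # non-PII, visible values such as scores


registry = [
    SchemaField("users", "email", "email", is_pii=True, visible=False),
    SchemaField("users", "churn_score", None, is_pii=False, visible=True),
    SchemaField("users", "home_address", None, is_pii=True, visible=False),
]

for entry in registry:
    print(entry.table, entry.field, handling(entry))
```

Keeping the policy in one function means every activation path consults the same flags, which is the role the schema registry plays in the description above.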
:::tip Optional client-side anonymization
For regulated workloads, data can be anonymized or pseudonymized on the customer side before BPP ingests it (as implemented for our banking clients), so that ByTek never processes directly identifying raw data.
:::
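One common way to pseudonymize identifiers on the customer side is a keyed hash (HMAC-SHA-256) with a secret that never leaves the customer's environment. The sketch below assumes that approach; the function name and the trim-and-lowercase normalization are illustrative, and this page does not state which technique any specific customer uses.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Return a keyed, deterministic pseudonym for an identifier.

    The same input and key always give the same output, so records can
    still be joined, but the original value cannot be recovered without
    the key (which stays on the customer side in this sketch).
    """
    normalized = value.strip().lower()
    return hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key; in practice it would come from a secret manager.
key = b"customer-held-secret"
print(pseudonymize("Alice@Example.com", key))
```

Because the output is deterministic per key, identity reconciliation can still match the same user across tables without seeing the raw value.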
Read, process, write
┌───────────────────────────────────────────────────────────┐
│ CUSTOMER GCP PROJECT — BigQuery Data Warehouse │
│ • user tables • event/transaction tables │
│ • bpp_schema_info (PII flags, visibility) │
│ • *_bpp enriched tables + AI model outputs (written back) │
└───────────────────────────────────────────────────────────┘
▲ read (no duplication) ▲ write-back of results
│ │ (stays in customer BQ)
▼ │
┌───────────────────────────────────────────────────────────┐
│ ByTek control plane — EU/EEA (GCP Belgium + Hetzner DE) │
│ identity resolution · AI models · feature composer │
│ ↕ per-customer reconciliation DB (Cloud SQL Postgres, EU) │
└───────────────────────────────────────────────────────────┘
- Read: BPP reads source tables directly from the customer's BigQuery project.
- Process: identity reconciliation (a 3-step delta pipeline per source table), feature engineering, and model scoring all run on ByTek's EU infrastructure.
- Write-back: enriched tables (*_bpp) and AI model outputs are written back into the customer's BigQuery project, exposed via per-instance views and keyed to bytek_id.
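The read, process, write-back cycle can be pictured with a small in-memory sketch. Everything here is hypothetical: the `enrich` function, the `identity_map` lookup, the sample rows, and the rule that an enriched table is named by appending `_bpp` to the source name. It does not reproduce the 3-step delta pipeline, whose steps are not described on this page.

```python
def enrich(source_table, rows, identity_map, id_field="email"):
    """Attach a bytek_id to each source row and name the output table.

    rows         : list of dicts read from the source table
    identity_map : identifier value -> bytek_id (stands in for the
                   per-customer reconciliation database)
    """
    enriched = []
    for row in rows:
        bytek_id = identity_map.get(row[id_field])
        if bytek_id is None:
            continue  # unresolved rows are skipped in this sketch only
        enriched.append({"bytek_id": bytek_id, **row})
    return f"{source_table}_bpp", enriched


rows = [
    {"email": "alice@example.com", "orders": 3},
    {"email": "unknown@example.com", "orders": 1},
]
identity_map = {"alice@example.com": "bt_001"}

table_name, enriched_rows = enrich("users", rows, identity_map)
print(table_name, enriched_rows)
```

In the real platform the output would be written back to the customer's BigQuery project rather than returned in memory.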
Activation — routing from BigQuery to other platforms
Only pseudonymized identifiers and predictive/segment values leave the platform, over TLS 1.2+ encrypted API calls. Identifiers are normalized and SHA-256-hashed before upload.
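A minimal sketch of the normalize-then-hash step for an email identifier follows. The normalization shown (trim whitespace, lowercase) is an assumption; each destination documents its own exact rules, and phone numbers or names need different handling.

```python
import hashlib


def normalize_email(email: str) -> str:
    """Trim surrounding whitespace and lowercase (assumed minimal rules)."""
    return email.strip().lower()


def hash_identifier(value: str) -> str:
    """Return the SHA-256 digest of a normalized identifier as lowercase hex."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


hashed = hash_identifier(normalize_email("  Alice@Example.COM "))
print(hashed)
```

Normalizing before hashing matters because SHA-256 is sensitive to every byte: without it, the same address typed with different casing would produce a different digest and fail to match on the destination side.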
Audience Manager — user segments
| Destination | What is sent | Protection |
|---|---|---|
| Google Ads — Customer Match | Matching identifiers + segment membership | PII normalization + SHA-256 hashing before upload |
| Meta Ads — Custom Audiences | Matching identifiers + segment membership | PII normalization + SHA-256 hashing before upload |
Signals Manager — event-based conversions
| Destination | What is sent | Notes |
|---|---|---|
| Google Ads — Enhanced Conversions for Leads (ECL) | Conversion event + hashed identifiers + value | Value computed via a constrained formula parser |
| Google Ads — Conversion Adjustment | Adjustment event + value | — |
| Meta Ads — Conversions API (CAPI) | Conversion event + hashed identifiers + value | — |
| Generic REST API endpoint | Configurable payload | Auth: None / API Key / Basic / Bearer Token |
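The "constrained formula parser" noted for ECL values is not specified here. One way such a parser can be built is to parse the formula into a syntax tree and accept only a whitelist of node types. The sketch below assumes that approach with Python's `ast` module; the function name, the variable `predicted_ltv`, and the allowed operators are illustrative.

```python
import ast
import operator

# Only these binary operators are accepted; anything else is rejected.
_ALLOWED_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate_formula(formula: str, variables: dict) -> float:
    """Evaluate an arithmetic formula over named variables, nothing more."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.Name) and node.id in variables:
            return variables[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _ALLOWED_OPS:
            return _ALLOWED_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        # Function calls, attribute access, unknown names, etc. end up here.
        raise ValueError(f"disallowed expression element: {type(node).__name__}")

    return float(_eval(ast.parse(formula, mode="eval")))


value = evaluate_formula("predicted_ltv * 0.5 + 10", {"predicted_ltv": 200})
print(value)  # 110.0
```

Whitelisting node types, rather than calling `eval`, means a formula such as `__import__('os')` is rejected before anything runs.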
Customer BigQuery ──► Audience/Signals Manager ──► [SHA-256 hashing, TLS 1.2+] ──►
Google Ads · Meta Ads · Generic REST API
(hashed IDs + predicted values only — no raw PII, no behavioral/transactional rows)
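For the Generic REST API destination, the four auth options listed in the Signals Manager table map onto standard HTTP request headers. The sketch below shows one plausible mapping; the function name, the `auth_type` strings, and the default `X-API-Key` header name are assumptions, since the receiving endpoint decides which header it expects.

```python
import base64


def build_auth_headers(auth_type, *, api_key=None, username=None,
                       password=None, token=None, api_key_header="X-API-Key"):
    """Return HTTP headers for one of: none, api_key, basic, bearer."""
    if auth_type == "none":
        return {}
    if auth_type == "api_key":
        return {api_key_header: api_key}
    if auth_type == "basic":
        # HTTP Basic: base64 of "username:password"
        credentials = f"{username}:{password}".encode("utf-8")
        encoded = base64.b64encode(credentials).decode("ascii")
        return {"Authorization": f"Basic {encoded}"}
    if auth_type == "bearer":
        return {"Authorization": f"Bearer {token}"}
    raise ValueError(f"unsupported auth type: {auth_type}")


print(build_auth_headers("basic", username="user", password="pass"))
# {'Authorization': 'Basic dXNlcjpwYXNz'}
```

Basic and Bearer credentials are only encoded, not encrypted, so they rely on the TLS transport described above for confidentiality.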
Activation destinations (Google Ads, Meta) act as independent controllers for the data they receive, under the customer's own platform terms. The supported source warehouse is currently BigQuery only.