Data Architecture & Routing

This page explains what data BPP touches, where it lives, and how it moves from BigQuery to downstream platforms.

Where data lives

| Tier | Location | Contents |
| --- | --- | --- |
| Customer data | Customer's BigQuery project (customer-controlled) | Raw user tables, event/transaction tables, enriched *_bpp tables, and AI model output tables. Read in place; results written back; not duplicated into a ByTek data lake. |
| Identity graph | ByTek-managed Cloud SQL (PostgreSQL), one isolated database per customer, GCP EU | The bytek_id mapping and the identifiers required to reconcile users across tables, used solely for identity resolution. |
| ETL / model processing | ByTek EU infrastructure (GCP Belgium + Hetzner, Germany) | Transient processing during reconciliation, feature engineering, and scoring. |

A metadata table in the customer's BigQuery dataset (bpp_schema_info, the schema registry) declares every table/field, identifier type, PII flag, and visibility setting. PII flags govern how each field may be used and activated.
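
To make the role of the PII and visibility flags concrete, here is a minimal sketch of how a registry row could drive an activation decision. The `FieldInfo` shape, its column names, and the three-way policy (`blocked`, `hashed`, `plain`) are illustrative assumptions, not the actual bpp_schema_info schema or BPP's real rules.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class FieldInfo:
    """One registry row. Column names here are illustrative assumptions."""
    table: str
    field: str
    identifier_type: Optional[str]  # e.g. "email"; None for non-identifier fields
    is_pii: bool
    visible: bool


def activation_mode(info: FieldInfo) -> str:
    """Classify a field as 'blocked', 'hashed', or 'plain' (assumed policy)."""
    if not info.visible:
        return "blocked"
    if info.is_pii:
        # Assumption: PII may only leave as a hashed matching identifier.
        return "hashed" if info.identifier_type else "blocked"
    return "plain"


# Hypothetical registry contents, for illustration only.
registry: List[FieldInfo] = [
    FieldInfo("users", "email", "email", is_pii=True, visible=True),
    FieldInfo("users", "home_address", None, is_pii=True, visible=True),
    FieldInfo("users_bpp", "churn_score", None, is_pii=False, visible=True),
    FieldInfo("orders", "internal_note", None, is_pii=False, visible=False),
]

modes = {f"{r.table}.{r.field}": activation_mode(r) for r in registry}
```

In this sketch the email column is eligible only in hashed form, the free-text address is never activated, and the model score can be sent as a plain value.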

:::tip Optional client-side anonymization
For regulated workloads, data can be anonymized or pseudonymized on the customer side before BPP ingests it (as implemented for our banking clients), so that ByTek never processes directly identifying raw data.
:::

Read, process, write

┌───────────────────────────────────────────────────────────┐
│ CUSTOMER GCP PROJECT — BigQuery Data Warehouse │
│ • user tables • event/transaction tables │
│ • bpp_schema_info (PII flags, visibility) │
│ • *_bpp enriched tables + AI model outputs (written back) │
└───────────────────────────────────────────────────────────┘
▲ read (no duplication) ▲ write-back of results
│ │ (stays in customer BQ)
▼ │
┌───────────────────────────────────────────────────────────┐
│ ByTek control plane — EU/EEA (GCP Belgium + Hetzner DE) │
│ identity resolution · AI models · feature composer │
│ ↕ per-customer reconciliation DB (Cloud SQL Postgres, EU) │
└───────────────────────────────────────────────────────────┘
  • Read: BPP reads source tables directly from the customer's BigQuery project.
  • Process: identity reconciliation (a 3-step delta pipeline per source table), feature engineering, and model scoring, on ByTek's EU infrastructure.
  • Write-back: enriched tables (*_bpp) and AI model outputs are written back into the customer's BigQuery project, exposed via per-instance views and keyed to bytek_id.
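
This page does not describe the reconciliation algorithm itself, so the following is only a generic illustration of what identity resolution means: identifiers that appear together on one source row are linked, and every connected group corresponds to one user (one bytek_id). It uses a union-find structure as a stand-in; the `IdentityGraph` class and its method names are hypothetical and not part of BPP.

```python
from typing import Dict, Iterable, Tuple

Identifier = Tuple[str, str]  # (identifier type, value), e.g. ("email", "a@x.io")


class IdentityGraph:
    """Union-find over identifiers; each connected component is one user."""

    def __init__(self) -> None:
        self._parent: Dict[Identifier, Identifier] = {}

    def _find(self, item: Identifier) -> Identifier:
        self._parent.setdefault(item, item)
        root = item
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every visited node straight at the root.
        while self._parent[item] != root:
            next_item = self._parent[item]
            self._parent[item] = root
            item = next_item
        return root

    def link(self, identifiers: Iterable[Identifier]) -> None:
        """Record that all identifiers on one source row belong to one user."""
        ids = list(identifiers)
        if not ids:
            return
        root = self._find(ids[0])
        for other in ids[1:]:
            self._parent[self._find(other)] = root

    def same_user(self, a: Identifier, b: Identifier) -> bool:
        return self._find(a) == self._find(b)


graph = IdentityGraph()
graph.link([("email", "a@x.io"), ("crm_id", "42")])      # row from a CRM table
graph.link([("crm_id", "42"), ("phone", "+390000000")])  # row from an orders table
graph.link([("email", "b@x.io")])                        # unrelated user
```

Here the email and the phone number end up in the same group because both rows share `crm_id` 42, even though they never appear on the same row.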

Activation — routing from BigQuery to other platforms

Only pseudonymized identifiers and predictive/segment values leave the platform, over TLS 1.2+ encrypted API calls. Identifiers are normalized and SHA-256-hashed before upload.
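
As a rough sketch of the normalize-then-hash step for an email identifier: the exact normalization rules BPP applies are not listed on this page, so trimming whitespace and lowercasing are assumptions here, and the function names are hypothetical.

```python
import hashlib


def normalize_email(raw: str) -> str:
    """Assumed normalization: strip surrounding whitespace and lowercase."""
    return raw.strip().lower()


def hash_identifier(normalized: str) -> str:
    """Return the SHA-256 digest of the normalized value as lowercase hex."""
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


hashed_a = hash_identifier(normalize_email("  Jane.Doe@Example.COM "))
hashed_b = hash_identifier(normalize_email("jane.doe@example.com"))
```

Normalizing first matters because two spellings of the same address would otherwise produce different digests and fail to match on the destination platform.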

Audience Manager — user segments

| Destination | What is sent | Protection |
| --- | --- | --- |
| Google Ads — Customer Match | Matching identifiers + segment membership | PII normalization + SHA-256 hashing before upload |
| Meta Ads — Custom Audiences | Matching identifiers + segment membership | PII normalization + SHA-256 hashing before upload |

Signals Manager — event-based conversions

| Destination | What is sent | Notes |
| --- | --- | --- |
| Google Ads — Enhanced Conversions for Leads (ECL) | Conversion event + hashed identifiers + value | Value computed via a constrained formula parser |
| Google Ads — Conversion Adjustment | Adjustment event + value | |
| Meta Ads — Conversions API (CAPI) | Conversion event + hashed identifiers + value | |
| Generic REST API endpoint | Configurable payload | Auth: None / API Key / Basic / Bearer Token |

Customer BigQuery ──► Audience/Signals Manager ──► [SHA-256 hashing, TLS 1.2+] ──►
Google Ads · Meta Ads · Generic REST API
(hashed IDs + predicted values only — no raw PII, no behavioral/transactional rows)
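
The Signals Manager table mentions that conversion values are computed via a constrained formula parser. The grammar BPP accepts is not documented here, so the following is only a sketch of the general idea: parse the formula, then evaluate it while allowing nothing but numbers, named variables, and basic arithmetic. The `evaluate_formula` function and the `predicted_ltv` variable are hypothetical.

```python
import ast
import operator
from typing import Mapping

# Whitelisted operators; anything else is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def evaluate_formula(formula: str, variables: Mapping[str, float]) -> float:
    """Evaluate an arithmetic formula over known variables, nothing more."""
    tree = ast.parse(formula, mode="eval")  # raises SyntaxError on bad input

    def walk(node: ast.AST) -> float:
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return float(node.value)
        if isinstance(node, ast.Name) and node.id in variables:
            return float(variables[node.id])
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](walk(node.operand))
        # Calls, attribute access, subscripts, unknown names, etc. land here.
        raise ValueError(f"disallowed expression element: {type(node).__name__}")

    return walk(tree)


value = evaluate_formula("predicted_ltv * 0.5 + 5", {"predicted_ltv": 100.0})
```

Because the evaluator walks the syntax tree instead of calling `eval`, a formula such as `__import__('os').system('ls')` is rejected with a `ValueError` before anything runs.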

Activation destinations (Google Ads, Meta) act as independent controllers for the data they receive, under the customer's own platform terms. BigQuery is currently the only supported source warehouse.