Building a Reliable DV360 and Ads Data Hub Pipeline with BigQuery, Cloud Run, and AWS SQS

Engineering lessons from building a reliable marketing data pipeline with DV360, Ads Data Hub, BigQuery, Cloud Run, Cloud Tasks, and AWS SQS.


January 15, 2026

I recently built a production Google Marketing Platform data pipeline that moves DV360 advertising data from Google Cloud to AWS. Three event-driven pipelines, each with its own trigger pattern: DV360 metadata sync with parent-before-child ordering, DV360 report auto-detection with field normalization, and Ads Data Hub match-rate analysis with async orchestration. All three deliver to AWS SQS with a BigQuery outbox pattern so no messages are silently dropped.

This post covers the architecture decisions and reliability patterns behind the DV360 & Ads Data Hub data pipeline case study.

The Real Problem: Reliability, Not Just Extraction

Getting data out of DV360 is not the hard part. Google provides managed data transfers that land campaign metadata in BigQuery. The hard part is everything that happens after:

  • Ordered delivery: DV360's campaign hierarchy has strict parent-child dependencies. If a line item arrives at the downstream system before its parent insertion order, the message is rejected.
  • Unpredictable report tables: DV360 creates report landing tables in BigQuery with date-suffixed names. Three report types arrive independently with different column sets. There's no schedule to rely on.
  • Long-running async APIs: Ads Data Hub match-rate queries take minutes and are subject to differential privacy checks that can fail unpredictably, with each failure triggering a mandatory 7-hour cooldown.
  • Cross-cloud delivery guarantees: Every message sent from GCP to AWS SQS needs a receipt. Missing messages need to be detected, not silently ignored.

A simple "extract and push" pipeline would break on any of these. The system needed to handle all four simultaneously.

Why Event-Driven Pipelines

Every pipeline in this system reacts to data arrival rather than running on a schedule. Three different Eventarc trigger patterns drive the three pipelines:

Pipeline       Trigger              What Fires It
Metadata Sync  Pub/Sub message      DV360 data transfer completion event
Reports Sync   BigQuery audit log   New report table created in BigQuery
ADH Analytics  GCS object finalize  Sales CSV file uploaded to Cloud Storage

This matters because scheduled pipelines (cron-based) either run too early (no data yet) or too late (stale data). Event-driven pipelines start processing the moment data arrives. Each pipeline uses a different trigger pattern because each data source has a different arrival mechanism.

All three converge on the same delivery pattern: BigQuery outbox, SQS batch sends, receipt tracking, and reconciliation.

Handling Ordered DV360 Metadata Delivery

DV360's campaign hierarchy has 8 entity types across 6 dependency tiers:

Tier  Entity Types
0     Advertiser
1     Campaign
2     Insertion Order
3     Line Item
4     Line Item Targeting, Creative
5     Ad Group, Ad Group Ad

Delivery must be tier-sequential: tier 0 finishes completely before tier 1 starts. Within a tier, work is parallelized across shards using FARM_FINGERPRINT on advertiser ID for deterministic, retry-safe distribution.
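
To make that concrete, here is a minimal sketch of how a shard worker could scope its read. It assumes the worker uses the CLOUD_RUN_TASK_INDEX and CLOUD_RUN_TASK_COUNT environment variables that Cloud Run Jobs inject into each parallel task; the table and column names are illustrative, not the real schema:

```python
import os

from google.cloud import bigquery

# Cloud Run Jobs inject these into every parallel task.
SHARD_ID = int(os.environ.get("CLOUD_RUN_TASK_INDEX", "0"))
SHARD_COUNT = int(os.environ.get("CLOUD_RUN_TASK_COUNT", "1"))

# FARM_FINGERPRINT is deterministic, so a retried shard re-reads
# exactly the advertisers it was assigned the first time.
QUERY = """
SELECT *
FROM `project.dataset.tier_outbox`  -- illustrative table name
WHERE tier = @tier
  AND MOD(ABS(FARM_FINGERPRINT(CAST(advertiser_id AS STRING))), @shard_count) = @shard_id
"""

client = bigquery.Client()
rows = client.query(
    QUERY,
    job_config=bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("tier", "INT64", 3),
            bigquery.ScalarQueryParameter("shard_count", "INT64", SHARD_COUNT),
            bigquery.ScalarQueryParameter("shard_id", "INT64", SHARD_ID),
        ]
    ),
).result()
```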

The orchestration uses a completion-chaining pattern: a Cloud Run Service ("tier controller") receives a Cloud Tasks callback for each tier, launches a Cloud Run Job with parallel shard workers, waits for completion, then enqueues the next tier.

The Loopback Pattern

Cloud Tasks has a ~30-minute dispatch deadline. Cloud Run Jobs processing large entity sets can run for hours. If the tier controller is still waiting as its deadline approaches, it re-enqueues itself with a resume_execution parameter and then returns HTTP 200 so Cloud Tasks doesn't retry. The next invocation picks up where the previous one left off, polling the same Job execution instead of starting a new one.

This converts a hard timeout into a series of short-lived, resumable polling sessions. No duplicate Jobs, no timeout failures.
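
A condensed sketch of the loopback handler; launch_tier_job, get_execution_state, and enqueue_tier_task are hypothetical helpers standing in for the real Cloud Run Jobs and Cloud Tasks calls:

```python
import time

from flask import Flask, request

app = Flask(__name__)

# Leave headroom under Cloud Tasks' ~30-minute dispatch deadline.
POLL_BUDGET_SECONDS = 25 * 60

@app.post("/tier")
def tier_controller():
    body = request.get_json()
    # Resume an in-flight execution on a loopback call; otherwise
    # launch the Cloud Run Job for this tier.
    execution = body.get("resume_execution") or launch_tier_job(body["tier"])  # hypothetical helper

    deadline = time.monotonic() + POLL_BUDGET_SECONDS
    while time.monotonic() < deadline:
        state = get_execution_state(execution)  # hypothetical helper
        if state == "SUCCEEDED":
            enqueue_tier_task({"tier": body["tier"] + 1})  # hypothetical helper
            return "", 200
        if state == "FAILED":
            return "", 500  # non-2xx, so Cloud Tasks retries this tier
        time.sleep(30)

    # Budget exhausted: re-enqueue ourselves pointing at the same
    # execution, then return 200 so Cloud Tasks doesn't retry.
    enqueue_tier_task({"tier": body["tier"], "resume_execution": execution})
    return "", 200
```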

Normalizing Unpredictable Report Tables

DV360 creates report landing tables in BigQuery with date-suffixed names. Three report types -- display traffic, YouTube traffic, and reach -- arrive independently with different column sets.

An Eventarc trigger on BigQuery's ADMIN_WRITE audit log fires whenever a table is created. The pipeline parses the report type from the table name prefix, ignores non-matching tables, and processes valid report types through the same outbox delivery pattern.
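
As a sketch, the classification step might look like this, assuming the audit-log event exposes the new table's resourceName; the name prefixes are illustrative stand-ins for the real transfer config:

```python
import re

# Illustrative prefixes; the real ones come from the transfer config.
REPORT_PREFIXES = {
    "dv360_display_": "display",
    "dv360_youtube_": "youtube",
    "dv360_reach_": "reach",
}

def classify_table(resource_name: str) -> str | None:
    """Map an audit-log resourceName like
    projects/p/datasets/d/tables/dv360_display_20260115
    to a report type, or None for tables we ignore."""
    match = re.search(r"/tables/([^/]+)$", resource_name)
    if not match:
        return None
    table = match.group(1)
    for prefix, report_type in REPORT_PREFIXES.items():
        if table.startswith(prefix) and re.fullmatch(r"\d{8}", table[len(prefix):]):
            return report_type
    return None
```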

The interesting part is field transformation. DV360 exports fields in snake_case (insertion_order_id, active_view_viewable_impressions). The downstream system expects dot-notation (flight.id, viewableImpressions). Workers apply a mapping layer that:

  • Transforms dimension fields to hierarchical dot-notation
  • Maps metric fields to standardized names
  • Separates dimensions from metrics into distinct JSON payload sections
  • Provides default fallback values for optional browser and OS fields

The transformation happens at the worker level, not in the outbox. The outbox stores raw DV360 data (built by pure SQL procedures), and workers apply the mapping on read. This keeps outbox construction simple and puts transformation logic in Python where it's testable.
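
A minimal sketch of that mapping layer, with an illustrative subset of the field config:

```python
# Illustrative subset; the real config covers every exported field.
DIMENSION_MAP = {
    "insertion_order_id": "flight.id",
    "line_item_id": "lineItem.id",
}
METRIC_MAP = {
    "active_view_viewable_impressions": "viewableImpressions",
    "impressions": "impressions",
}
DEFAULTS = {"browser": "unknown", "os": "unknown"}  # optional fields

def transform(row: dict) -> dict:
    """Split a raw snake_case DV360 row into the dimensions/metrics
    payload shape the downstream system expects."""
    dimensions = {DIMENSION_MAP[k]: v for k, v in row.items() if k in DIMENSION_MAP}
    metrics = {METRIC_MAP[k]: v for k, v in row.items() if k in METRIC_MAP}
    for field, default in DEFAULTS.items():
        dimensions.setdefault(field, row.get(field, default))
    return {"dimensions": dimensions, "metrics": metrics}
```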

Orchestrating Long-Running Ads Data Hub Workflows

Ads Data Hub is a privacy sandbox where queries run inside Google's infrastructure. You can join first-party data against YouTube impression data, but results are subject to differential privacy checks. The pipeline needed to automate the full lifecycle: sales CSV ingestion, schema resolution, UPDM table creation, match-rate query execution, and result delivery.

The solution is a 6-state async state machine driven entirely by Cloud Tasks callbacks:

CREATED -> UPDM_SUBMITTED -> UPDM_DONE -> MATCH_RATE_SUBMITTED -> MATCH_RATE_DONE -> DONE

Each state transition submits work to the ADH API, schedules a Cloud Task to check on it later, and advances the state when the operation completes. Expected-state guards on every transition prevent duplicate processing if a Cloud Task fires twice.
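
In sketch form, each callback looks roughly like this; read_batch_state, do_work_for, write_batch_state, and schedule_check are hypothetical helpers standing in for the real BigQuery reads/writes and Cloud Tasks scheduling:

```python
TRANSITIONS = {
    "CREATED": "UPDM_SUBMITTED",
    "UPDM_SUBMITTED": "UPDM_DONE",
    "UPDM_DONE": "MATCH_RATE_SUBMITTED",
    "MATCH_RATE_SUBMITTED": "MATCH_RATE_DONE",
    "MATCH_RATE_DONE": "DONE",
}

def handle_callback(batch_id: str, expected_state: str) -> None:
    current = read_batch_state(batch_id)  # hypothetical helper
    if current != expected_state:
        # A duplicate Cloud Task fired; the batch already advanced.
        return
    next_state = TRANSITIONS[expected_state]
    do_work_for(batch_id, next_state)        # submit to ADH, poll, etc.
    write_batch_state(batch_id, next_state)  # hypothetical helper
    if next_state != "DONE":
        schedule_check(batch_id, next_state)  # Cloud Task for the next hop
```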

Three Distinct Error Paths

ADH failures aren't all the same, so they need different handling (a classification sketch follows the list):

  • Retryable errors (429, 500, 502, 503, 504): exponential backoff up to 5 attempts
  • Privacy errors (differential privacy thresholds not met): 7-hour mandatory cooldown before retry, detected by keyword matching in error messages
  • Fatal errors: batch marked as failed, no further tasks scheduled
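
A sketch of the classification logic; the privacy keywords shown are illustrative stand-ins for the actual strings matched:

```python
RETRYABLE_CODES = {429, 500, 502, 503, 504}
PRIVACY_KEYWORDS = ("privacy", "noise", "aggregation threshold")  # illustrative
PRIVACY_COOLDOWN_HOURS = 7

def classify_adh_error(status_code: int, message: str) -> tuple[str, int]:
    """Return (category, retry_delay_hours) for an ADH failure."""
    lowered = message.lower()
    if any(keyword in lowered for keyword in PRIVACY_KEYWORDS):
        return "privacy", PRIVACY_COOLDOWN_HOURS
    if status_code in RETRYABLE_CODES:
        return "retryable", 0  # caller applies exponential backoff
    return "fatal", -1  # mark batch failed, schedule nothing
```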

A Cloud Scheduler job runs every 6 hours as a safety net, hitting a recovery endpoint that re-schedules any batch stuck in a non-terminal state. If new data arrived during processing, the system auto-triggers a new batch on completion.

Dynamic CSV Schema Resolution

Sales CSV files arrive from multiple sources with inconsistent column names (Email, email_address, EMAIL, e-mail). Rather than rejecting files or maintaining per-source mappings, a schema resolution layer normalizes headers and matches them against an alias map. The alias map is defined in a JSON configuration file, making it extensible without code changes.
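
A minimal sketch of that resolution layer, assuming the alias file maps canonical field names to lists of accepted aliases:

```python
import json
import re

def normalize(header: str) -> str:
    """Lowercase and strip separators: 'E-Mail ' -> 'email'."""
    return re.sub(r"[^a-z0-9]", "", header.lower())

def resolve_headers(headers: list[str], alias_path: str) -> dict[str, str]:
    """Map raw CSV headers to canonical names using a JSON alias map
    like {"email": ["email", "email_address", "e-mail"]}."""
    with open(alias_path) as f:
        alias_map = json.load(f)  # canonical -> list of aliases
    lookup = {
        normalize(alias): canonical
        for canonical, aliases in alias_map.items()
        for alias in aliases
    }
    return {h: lookup[normalize(h)] for h in headers if normalize(h) in lookup}
```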

Why I Used a BigQuery Outbox Before AWS SQS

You can't atomically write to BigQuery and send to SQS in the same transaction. This creates three failure modes:

  1. Message sent but no record of it
  2. Record written but message never sent
  3. Partial batch -- some succeed, some fail

The transactional outbox pattern solves this. A BigQuery stored procedure builds a table of messages to send, each with a unique ID, before anything is sent. Workers read outbox rows, send to SQS in batches of 10 (SendMessageBatch), and write a receipt record for every successfully sent message.

This pattern is used identically across all three pipelines. The outbox is always built inside BigQuery's transactional guarantees, then delivery happens as a separate phase with full tracking.
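
A sketch of the delivery phase for one batch, with a hypothetical write_receipts helper standing in for the BigQuery receipt insert; it assumes each outbox row's message_id satisfies SQS batch-entry ID rules (alphanumeric, hyphens, underscores):

```python
import json

import boto3

def deliver_batch(rows: list[dict], queue_url: str) -> list[str]:
    """Send up to 10 outbox rows to SQS and return the IDs that
    made it. Receipts are written only for confirmed sends."""
    sqs = boto3.client("sqs")
    entries = [
        {"Id": row["message_id"], "MessageBody": json.dumps(row["payload"])}
        for row in rows[:10]  # SendMessageBatch caps at 10 entries
    ]
    response = sqs.send_message_batch(QueueUrl=queue_url, Entries=entries)
    sent_ids = [item["Id"] for item in response.get("Successful", [])]
    # Failed entries stay un-receipted; reconciliation flags them and
    # a retry pass can pick them back up from the outbox.
    write_receipts(sent_ids)  # hypothetical BigQuery insert helper
    return sent_ids
```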

How Each Pipeline Uses the Outbox

          Metadata                             Reports                              ADH
Worker    Cloud Run Job (parallel shards)      Cloud Run Job (parallel shards)      Cloud Run Service (Flask)
Payload   Entity JSON from field config        Transformed fields (dot-notation)    Match-rate metrics
Recovery  Auto-close + force-close procedures  Auto-close + force-close procedures  Recovery endpoint + Cloud Scheduler

Reconciliation: Making Failures Visible

After all messages are sent, a reconciliation procedure compares COUNT(*) of outbox rows against COUNT(DISTINCT) of receipted message IDs. Matching counts produce a SENT status. A mismatch produces PARTIAL and triggers an alert.

This is the key reliability guarantee: a message is never silently lost. Either it's delivered and receipted, or the reconciliation flags it. COUNT(DISTINCT) on receipts handles at-least-once receipt writes (a worker could write a receipt, crash, restart, and write it again).
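
A minimal sketch of the comparison, with illustrative table and column names; the real logic runs as a stored procedure inside BigQuery:

```python
from google.cloud import bigquery

# Illustrative table and column names.
RECONCILE_SQL = """
SELECT
  (SELECT COUNT(*)
   FROM `project.dataset.outbox` WHERE run_id = @run_id) AS expected,
  (SELECT COUNT(DISTINCT message_id)
   FROM `project.dataset.receipts` WHERE run_id = @run_id) AS delivered
"""

def reconcile(run_id: str) -> str:
    client = bigquery.Client()
    job = client.query(
        RECONCILE_SQL,
        job_config=bigquery.QueryJobConfig(
            query_parameters=[
                bigquery.ScalarQueryParameter("run_id", "STRING", run_id)
            ]
        ),
    )
    row = next(iter(job.result()))
    # COUNT(DISTINCT) absorbs duplicate receipts from worker retries.
    return "SENT" if row.expected == row.delivered else "PARTIAL"
```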

Additional reliability patterns across the system:

  • Concurrent run prevention: guard procedures check for active runs before starting new ones
  • Stuck batch recovery: auto-close for runs stuck longer than 4 hours
  • Deterministic shard assignment: FARM_FINGERPRINT ensures retry-safe worker allocation
  • Idempotent infrastructure: each pipeline's full GCP footprint provisioned in a single command

Lessons for Similar Data Pipeline Projects

After building this system, a few principles stand out:

Event-driven over scheduled. If your data arrives unpredictably, react to it instead of polling. Eventarc supports Pub/Sub messages, BigQuery audit logs, and GCS object events -- three patterns that cover most data arrival scenarios on GCP.

Outbox before external delivery. Any time you're sending data to an external system (SQS, webhook, API), build the outbox first. The cost of tracking what you intended to send is small. The cost of figuring out what was silently dropped is large.

Separate orchestration from compute. Cloud Workflows handles the outer flow (guard, merge, build outbox, enqueue). Cloud Tasks handles inner loops (tier chaining, state machine transitions). Cloud Run handles actual compute. Each layer is independently debuggable and replaceable.

Design for the API's failure modes. ADH has privacy cooldowns. SQS has partial batch failures. Cloud Tasks has dispatch deadlines. Each of these needed a specific code path, not a generic retry loop.

Reconcile everything. Expected-vs-actual comparisons catch problems that logs miss. If your system sends messages and you can't prove each one was delivered, you have a reliability gap.

What's Next

If your reporting workflow depends on APIs, spreadsheets, ad platforms, or asynchronous exports, I can help turn it into a reliable pipeline with validation, monitoring, and clean reporting outputs. The patterns here -- event-driven triggers, outbox delivery, receipt reconciliation, async state machines -- transfer to any system that needs to move marketing or advertising data between cloud platforms reliably.

View the full DV360 & Ads Data Hub data pipeline case study or get in touch to discuss your project.

If you're looking for a data engineer to build BigQuery pipelines, marketing data infrastructure, or cross-cloud delivery systems, check out my other data engineering projects or reach out directly.

