
BigQuery Cost Optimisation for GA4 Exports: 9 SQL Patterns (2026)

Intermediate

How do I reduce BigQuery costs on GA4 exports?

The biggest cost wins come from nine SQL patterns: (1) partition pruning via `_TABLE_SUFFIX BETWEEN` (10–50x cost difference vs derived filters), (2) clustering on source/medium/event_name (30–60% reduction on top of partitioning), (3) explicit column selection (never SELECT *), (4) materialised views for repeated aggregations, (5) `APPROX_COUNT_DISTINCT` instead of COUNT(DISTINCT), (6) dry runs before ad-hoc queries, (7) physical storage billing (up to 70% reduction for compressible data), (8) summary tables that pre-aggregate raw events, and (9) BI Engine for dashboard queries. BigQuery on-demand pricing is $6.25/TB scanned as of 2026, with compute representing 85–90% of total BigQuery cost.

Combined, these patterns can drop a $5,000/month BigQuery bill to under $500.

The 2026 pricing model

Before optimising, understand what you're being charged for:

  • On-demand compute: $6.25/TB scanned (was $5/TB before 2026)
  • Storage (active): $0.02/GB/month for tables modified in last 90 days
  • Storage (long-term): $0.01/GB/month — automatic 50% discount after 90 unmodified days
  • Streaming inserts: $0.01/200MB
  • BigQuery sandbox: Free — 1TB queries/month, 10GB storage, no streaming

Compute is 85–90% of total cost. Storage is rarely the optimisation target — query patterns are. The numbers below assume on-demand pricing; Editions/flat-rate has different mechanics covered later.

The 9 patterns

Pattern 1: Partition pruning via _TABLE_SUFFIX

The single highest-impact optimisation. GA4 exports are sharded into daily tables (events_20260501, events_20260502...). Use _TABLE_SUFFIX BETWEEN to limit scans:
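A sketch of the good and bad versions, assuming a hypothetical export dataset `your-project.analytics_123456`:

```sql
-- GOOD: literal date strings on _TABLE_SUFFIX let BigQuery prune shards
-- before scanning, so only 7 daily tables are read.
SELECT
  event_name,
  COUNT(*) AS events
FROM `your-project.analytics_123456.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20260501' AND '20260507'
GROUP BY event_name;

-- BAD: wrapping the pseudo-column in a function produces a derived filter.
-- BigQuery cannot prune on it, so every daily shard in the dataset is
-- scanned before the filter is applied.
SELECT
  event_name,
  COUNT(*) AS events
FROM `your-project.analytics_123456.events_*`
WHERE PARSE_DATE('%Y%m%d', _TABLE_SUFFIX)
      BETWEEN DATE '2026-05-01' AND DATE '2026-05-07'
GROUP BY event_name;
```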

Filtering on a derived expression, for example wrapping _TABLE_SUFFIX in PARSE_DATE or comparing it to a subquery result, looks equivalent, but BigQuery cannot prune shards on a derived filter and scans every daily table. This single mistake can multiply your query cost 10–50x. Always filter _TABLE_SUFFIX directly with literal date strings.

Pattern 2: Clustering on summary tables

For your flattened summary tables, cluster by frequently filtered columns:
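A sketch of the DDL, assuming a hypothetical `reporting` dataset and the standard GA4 export schema (`traffic_source.source`, `traffic_source.medium`):

```sql
-- Partition by day, cluster by the columns dashboards filter on most.
CREATE TABLE `your-project.reporting.events_flat`
PARTITION BY event_date
CLUSTER BY source, medium, event_name
AS
SELECT
  PARSE_DATE('%Y%m%d', _TABLE_SUFFIX) AS event_date,
  traffic_source.source AS source,
  traffic_source.medium AS medium,
  event_name,
  user_pseudo_id
FROM `your-project.analytics_123456.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20260101' AND '20260531';
```

Cluster-column order matters: put the most selective, most frequently filtered column first.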

When you filter by WHERE source='google' AND medium='organic', BigQuery only reads the blocks containing organic Google traffic rather than the entire partition. Clustering reduces bytes scanned 30–60% on top of partitioning — the simplest change with the biggest impact after partition pruning.

Combined partitioning + clustering: 90–99% reduction for well-targeted queries. On a 1TB table, that's scanning 1–10GB instead of the full terabyte.

Pattern 3: Never SELECT *

BigQuery uses columnar storage — it only charges for columns queried. SELECT * scans every column whether you need them or not.
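A sketch of the contrast, again assuming a hypothetical `your-project.analytics_123456` dataset:

```sql
-- GOOD: BigQuery bills only the bytes in these three columns.
SELECT event_name, event_timestamp, user_pseudo_id
FROM `your-project.analytics_123456.events_*`
WHERE _TABLE_SUFFIX = '20260501';

-- BAD: scans every column, including the wide event_params RECORD,
-- even if downstream code only reads three fields.
SELECT *
FROM `your-project.analytics_123456.events_*`
WHERE _TABLE_SUFFIX = '20260501';
```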

The same data can carry a 17x cost difference based purely on column selection. Always list explicit columns.

Pattern 4: Materialised views for repeated aggregations

If you query the same daily aggregation repeatedly (revenue by channel, sessions by country), materialise it:
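A sketch, assuming a hypothetical flattened base table `reporting.events_flat` with a `purchase_revenue` column. Materialised views cannot be defined over wildcard (`events_*`) tables, so they sit on flattened tables like this one:

```sql
-- Incremental daily revenue-by-channel aggregation.
CREATE MATERIALIZED VIEW `your-project.reporting.revenue_by_channel` AS
SELECT
  event_date,
  source,
  medium,
  SUM(purchase_revenue) AS revenue,
  COUNT(*) AS events
FROM `your-project.reporting.events_flat`
GROUP BY event_date, source, medium;
```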

Materialised views update incrementally as new data arrives. Querying the view costs cents instead of dollars per query. For dashboards refreshing 100+ times per day, this changes the economics entirely.

Pattern 5: APPROX_COUNT_DISTINCT vs COUNT(DISTINCT)

For unique-user counts where exact precision isn't required:
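A sketch of the two versions over a hypothetical export dataset:

```sql
-- Exact: builds a full distinct set of every user_pseudo_id (heavy shuffle).
SELECT COUNT(DISTINCT user_pseudo_id) AS users
FROM `your-project.analytics_123456.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20260501' AND '20260531';

-- Approximate: a fixed-size HyperLogLog++ sketch, typically within ~1-2%
-- of the exact count, at a fraction of the memory and slot time.
SELECT APPROX_COUNT_DISTINCT(user_pseudo_id) AS users_approx
FROM `your-project.analytics_123456.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20260501' AND '20260531';
```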

For dashboards and exploratory analysis where ~98% accuracy is fine, APPROX_COUNT_DISTINCT uses a fixed-size HyperLogLog++ sketch instead of tracking every distinct value, so it runs far faster and consumes far less memory and slot time. (Bytes scanned, and therefore on-demand cost, are the same; the savings show up in speed and in slot-based Editions pricing.)


Pattern 6: Dry run before ad-hoc queries

Before running a query you suspect might be expensive, dry-run it:
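A sketch using the bq CLI (dataset name is a placeholder); the `--dry_run` flag validates the query and reports estimated bytes without executing it:

```shell
# Free: reports the bytes the query would process, without running it.
bq query --use_legacy_sql=false --dry_run '
SELECT event_name, COUNT(*) AS events
FROM `your-project.analytics_123456.events_*`
WHERE _TABLE_SUFFIX BETWEEN "20260501" AND "20260507"
GROUP BY event_name'
```

Multiply the reported bytes by the on-demand rate ($6.25/TB) to estimate the cost before committing.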

The console also shows estimated bytes scanned in the top-right corner before you run any query. Dry runs are free — they tell you the cost before commitment. For exploratory work, the BigQuery Console's "Preview" tab is also free (it reads directly from storage without executing a query).

Pattern 7: Physical storage billing

BigQuery defaults to logical storage billing (uncompressed size). For compressible data (JSON logs, repetitive event params), physical storage billing can reduce costs up to 70%:
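A sketch, assuming a hypothetical project and dataset; the `region-us` qualifier on the INFORMATION_SCHEMA view is an assumption and should match your dataset's location:

```sql
-- Check the compression ratio first: if logical bytes are several times
-- physical bytes, physical billing is likely cheaper.
SELECT
  table_name,
  total_logical_bytes,
  total_physical_bytes,
  SAFE_DIVIDE(total_logical_bytes, total_physical_bytes) AS compression_ratio
FROM `your-project`.`region-us`.INFORMATION_SCHEMA.TABLE_STORAGE
WHERE table_schema = 'analytics_123456'
ORDER BY total_logical_bytes DESC;

-- Then switch the dataset's billing model (set per dataset, not per table).
ALTER SCHEMA `your-project.analytics_123456`
SET OPTIONS (storage_billing_model = 'PHYSICAL');
```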

Caveats:

  • You're also billed for the 7-day Time Travel window
  • Fail-safe storage costs apply
  • Best for datasets with high compression ratios (text-heavy, repetitive)

For GA4 exports specifically, physical billing typically saves 30–60% on storage costs because event_params is highly compressible.

Pattern 8: Summary tables that pre-aggregate

Instead of querying raw event tables for every dashboard query, build daily summary tables:
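A sketch of the scheduled query, with hypothetical dataset names; counting sessions via `user_pseudo_id` plus the `ga_session_id` event parameter is the common GA4 convention:

```sql
-- Runs once per day as a scheduled query: aggregates yesterday's shard into
-- a small summary table that dashboards hit instead of raw events.
INSERT INTO `your-project.reporting.sessions_daily`
SELECT
  PARSE_DATE('%Y%m%d', _TABLE_SUFFIX) AS event_date,
  traffic_source.source AS source,
  traffic_source.medium AS medium,
  COUNT(DISTINCT CONCAT(
    user_pseudo_id, '-',
    CAST((SELECT value.int_value
          FROM UNNEST(event_params)
          WHERE key = 'ga_session_id') AS STRING))) AS sessions,
  COUNT(*) AS events
FROM `your-project.analytics_123456.events_*`
WHERE _TABLE_SUFFIX = FORMAT_DATE('%Y%m%d',
        DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))
GROUP BY event_date, source, medium;
```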

Run this once per day via scheduled query. Dashboards then query sessions_daily (small, indexed) instead of raw events (huge). Cost drops by orders of magnitude.

Pattern 9: BI Engine for dashboard queries

For Looker / Looker Studio dashboards making frequent small queries, BI Engine caches frequently-accessed data in memory:

  • Sub-second dashboard performance
  • Reduced slot consumption
  • Lower compute spend

Cost: BI Engine reservation pricing. Worthwhile for dashboards with high query volume — pays back when dashboard queries exceed ~100 per day.

When to switch from On-Demand to Editions

The threshold rule: if your on-demand BigQuery spend consistently exceeds $5,000/month, evaluate Editions (flat-rate slot pricing).

Editions tiers (2026):

  • Standard — basic SQL, no Time Travel beyond 2 days
  • Enterprise — Time Travel up to 7 days, materialised views, BI Engine
  • Enterprise Plus — full features

You buy slots (compute capacity): you pay for a baseline continuously, and autoscaling covers bursts above it. Properties with predictable workloads above $5,000/month typically save 30–50% by switching to Editions. Below $5,000/month, on-demand is more economical.

What doesn't save money

Three common misconceptions:

1. LIMIT does NOT save money. LIMIT 10 only limits the rows returned; BigQuery still scans the same data. Use WHERE clauses on partition columns to actually reduce bytes scanned.

2. Storage tier doesn't help when compute dominates. Switching all your storage to long-term storage saves pennies if your real cost is compute. Optimise the queries first, storage second.

3. Sharding tables manually doesn't help. Creating events_20260101, events_20260102 as separate tables (instead of using BigQuery's built-in partitioning) creates metadata overhead and complicates permissions without performance benefit. Use partitioning, not sharding.

FAQ: BigQuery Cost Optimisation for GA4 Exports: 9 SQL Patterns

What should a team validate first when BigQuery costs on GA4 exports spike?

Reproduce the problem in the live implementation, isolate whether it is scoped to one report or flow, and compare it against at least one secondary source before changing the setup.

How do I know whether the fix actually worked?

You need before-and-after evidence in the browser and in the downstream report. A clean-looking dashboard without validation is not enough.

When should this become a full GA4 audit instead of a quick fix?

If the issue touches attribution, consent, revenue, campaign quality, or data trust for more than one workflow, it is usually safer to audit the surrounding implementation than patch only the visible symptom.


These findings come from auditing thousands of GA4 properties.

GA4 Audits Team


Analytics Engineering

Specialising in GA4 architecture, consent mode implementation, and multi-layer audit frameworks.
