How do I reduce BigQuery costs on GA4 exports?
The biggest cost wins come from nine SQL patterns: (1) partition pruning via `_TABLE_SUFFIX BETWEEN` (10–50x cost difference vs derived filters), (2) clustering on source/medium/event_name (30–60% reduction on top of partitioning), (3) explicit column selection (never `SELECT *`), (4) materialised views for repeated aggregations, (5) `APPROX_COUNT_DISTINCT` instead of `COUNT(DISTINCT)`, (6) dry runs before ad-hoc queries, (7) physical storage billing (up to 70% reduction for compressible data), (8) summary tables that pre-aggregate raw events, and (9) BI Engine for dashboard queries. BigQuery on-demand pricing is $6.25/TB scanned as of 2026, with compute representing 85–90% of total BigQuery cost.
Combined, these patterns can drop a $5,000/month BigQuery bill to under $500.
The 2026 pricing model
Before optimising, understand what you're being charged for:
- On-demand compute: $6.25/TB scanned (was $5/TB before 2026)
- Storage (active): $0.02/GB/month for tables modified in last 90 days
- Storage (long-term): $0.01/GB/month — automatic 50% discount after 90 unmodified days
- Streaming inserts: $0.01/200MB
- BigQuery sandbox: Free — 1TB queries/month, 10GB storage, no streaming
Compute is 85–90% of total cost. Storage is rarely the optimisation target — query patterns are. The numbers below assume on-demand pricing; Editions/flat-rate has different mechanics covered later.
The 9 patterns
Pattern 1: Partition pruning via _TABLE_SUFFIX
The single highest-impact optimisation. GA4 exports are sharded into daily tables (`events_20260501`, `events_20260502`, ...). Use `_TABLE_SUFFIX BETWEEN` to limit scans:
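A minimal sketch, assuming a hypothetical GA4 export dataset `analytics_123456`:

```sql
-- Good: _TABLE_SUFFIX compared against literal date strings.
-- BigQuery prunes to the two matching daily shards before scanning.
SELECT event_name, COUNT(*) AS events
FROM `analytics_123456.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20260501' AND '20260502'
GROUP BY event_name;

-- Bad: the filter is derived from event_timestamp, so every daily
-- shard is scanned before rows are filtered out.
SELECT event_name, COUNT(*) AS events
FROM `analytics_123456.events_*`
WHERE DATE(TIMESTAMP_MICROS(event_timestamp))
      BETWEEN DATE '2026-05-01' AND DATE '2026-05-02'
GROUP BY event_name;
```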
The bad version looks equivalent, but BigQuery cannot use partition pruning on a derived filter. This single mistake can 10–50x your query cost. Always filter on `_TABLE_SUFFIX` directly with literal date strings.
Pattern 2: Clustering on summary tables
For your flattened summary tables, cluster by frequently filtered columns:
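A sketch of the DDL, with hypothetical project and table names, building a flattened copy from the raw export:

```sql
-- Flattened summary table, partitioned by day and clustered on the
-- columns dashboards filter by most often.
CREATE TABLE `myproject.analytics.events_flat`
PARTITION BY event_date
CLUSTER BY source, medium, event_name AS
SELECT
  PARSE_DATE('%Y%m%d', event_date) AS event_date,  -- export stores dates as 'YYYYMMDD' strings
  traffic_source.source AS source,
  traffic_source.medium AS medium,
  event_name,
  user_pseudo_id
FROM `myproject.analytics_123456.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20260101' AND '20260531';
```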
When you filter by `WHERE source = 'google' AND medium = 'organic'`, BigQuery only reads the blocks containing organic Google traffic rather than the entire partition. Clustering reduces bytes scanned 30–60% on top of partitioning — the simplest change with the biggest impact after partition pruning.
Combined partitioning + clustering: 90–99% reduction for well-targeted queries. On a 1TB table, that's scanning 1–10GB instead of the full terabyte.
Pattern 3: Never SELECT *
BigQuery uses columnar storage — it only charges for the columns a query reads. `SELECT *` scans every column whether you need them or not.
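For example (hypothetical dataset; the exact ratio depends on how wide the columns you skip are):

```sql
-- Scans every column, including the wide event_params and
-- user_properties RECORDs you may not need.
SELECT *
FROM `analytics_123456.events_*`
WHERE _TABLE_SUFFIX = '20260501';

-- Scans only the three columns named. Same rows, far fewer bytes billed.
SELECT event_name, event_timestamp, user_pseudo_id
FROM `analytics_123456.events_*`
WHERE _TABLE_SUFFIX = '20260501';
```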
Same data. 17x cost difference based purely on column selection. Always list explicit columns.
Pattern 4: Materialised views for repeated aggregations
If you query the same daily aggregation repeatedly (revenue by channel, sessions by country), materialise it:
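A sketch, built on the hypothetical `events_flat` table from Pattern 2 (materialised views can't be defined over the wildcard-sharded raw export, so they pair naturally with a flattened summary table):

```sql
CREATE MATERIALIZED VIEW `myproject.analytics.channel_daily_mv` AS
SELECT
  event_date,
  source,
  medium,
  COUNT(*) AS events,
  APPROX_COUNT_DISTINCT(user_pseudo_id) AS approx_users
FROM `myproject.analytics.events_flat`
GROUP BY event_date, source, medium;
```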
Materialised views update incrementally as new data arrives. Querying the view costs cents instead of dollars per query. For dashboards refreshing 100+ times per day, this changes the economics entirely.
Pattern 5: APPROX_COUNT_DISTINCT vs COUNT(DISTINCT)
For unique-user counts where exact precision isn't required:
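Side by side, on a hypothetical month of data:

```sql
-- Exact: shuffles every distinct user_pseudo_id across slots.
SELECT COUNT(DISTINCT user_pseudo_id) AS users
FROM `analytics_123456.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20260401' AND '20260430';

-- Approximate: HyperLogLog++ sketch, typically within ~1–2% of exact.
SELECT APPROX_COUNT_DISTINCT(user_pseudo_id) AS approx_users
FROM `analytics_123456.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20260401' AND '20260430';
```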
For dashboards and exploratory analysis where ~98% accuracy is fine, `APPROX_COUNT_DISTINCT` runs dramatically faster and uses far less slot compute. Note that on-demand billing charges for the same bytes either way (the same column is read); the saving is in compute time, which is direct money under Editions and faster, more reliable queries on-demand.
Pattern 6: Dry run before ad-hoc queries
Before running a query you suspect might be expensive, dry-run it:
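One way, using the bq CLI (dataset hypothetical); the `--dry_run` flag validates the query and reports estimated bytes without executing it:

```bash
# Validates the query and reports estimated bytes processed,
# without running it or incurring any charge.
bq query --use_legacy_sql=false --dry_run '
SELECT event_name, COUNT(*) AS events
FROM `analytics_123456.events_*`
WHERE _TABLE_SUFFIX BETWEEN "20260501" AND "20260507"
GROUP BY event_name'
```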
The console also shows estimated bytes scanned in the top-right corner before you run any query. Dry runs are free — they tell you the cost before commitment. For exploratory work, the BigQuery Console's "Preview" tab is also free (it reads directly from storage without executing a query).
Pattern 7: Physical storage billing
BigQuery defaults to logical storage billing (uncompressed size). For compressible data (JSON logs, repetitive event params), physical storage billing can reduce storage costs by up to 70%:
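The switch is a dataset-level setting, e.g. (names hypothetical):

```sql
-- Dataset-level switch to physical (compressed) storage billing.
ALTER SCHEMA `myproject.analytics`
SET OPTIONS (storage_billing_model = 'PHYSICAL');
```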
Caveats:
- You're also billed for the 7-day Time Travel window
- Fail-safe storage costs apply
- Best for datasets with high compression ratios (text-heavy, repetitive)
For GA4 exports specifically, physical billing typically saves 30–60% on storage costs because event_params is highly compressible.
Pattern 8: Summary tables that pre-aggregate
Instead of querying raw event tables for every dashboard query, build daily summary tables:
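A sketch of what such a scheduled query might look like (names hypothetical):

```sql
-- One day's roll-up: one row per session per channel. Assumes a
-- sessions_daily table already exists with this schema; a scheduled
-- query would derive the shard suffix from @run_date instead of
-- hard-coding it.
INSERT INTO `myproject.analytics.sessions_daily`
SELECT
  PARSE_DATE('%Y%m%d', event_date) AS session_date,
  traffic_source.source AS source,
  traffic_source.medium AS medium,
  CONCAT(user_pseudo_id, '-', CAST((
    SELECT value.int_value FROM UNNEST(event_params)
    WHERE key = 'ga_session_id') AS STRING)) AS session_key,
  COUNT(*) AS events
FROM `myproject.analytics_123456.events_*`
WHERE _TABLE_SUFFIX = '20260501'
GROUP BY session_date, source, medium, session_key;
```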
Run this once per day via scheduled query. Dashboards then query `sessions_daily` (small, partitioned, and clustered) instead of the raw events tables (huge). Cost drops by orders of magnitude.
Pattern 9: BI Engine for dashboard queries
For Looker / Looker Studio dashboards making frequent small queries, BI Engine caches frequently-accessed data in memory:
- Sub-second dashboard performance
- Reduced slot consumption
- Lower compute spend
Cost: BI Engine reservation pricing, a per-GB memory capacity charge. Worthwhile for dashboards with high query volume — it pays back when dashboard queries exceed ~100 per day.
When to switch from On-Demand to Editions
The threshold rule: if your on-demand BigQuery spend consistently exceeds $5,000/month, evaluate Editions (flat-rate slot pricing).
Editions tiers (2026):
- Standard — basic SQL, no Time Travel beyond 2 days
- Enterprise — Time Travel up to 7 days, materialised views, BI Engine
- Enterprise Plus — full features
You buy slots (compute capacity), paying continuously for a baseline and autoscaling above it for bursts. Properties with predictable workloads above $5,000/month typically save 30–50% switching to Editions. Below $5,000/month, on-demand is more economical.
What doesn't save money
Three common misconceptions:
1. LIMIT does NOT save money. `LIMIT 10` only limits returned rows; BigQuery still scans and bills the same data (see the sketch after this list). Use WHERE clauses with partition columns to actually reduce scan.
2. Storage tier doesn't help when compute dominates. Switching all your storage to long-term storage saves pennies if your real cost is compute. Optimise the queries first, storage second.
3. Sharding tables manually doesn't help. Creating events_20260101, events_20260102 as separate tables (instead of using BigQuery's built-in partitioning) creates metadata overhead and complicates permissions without performance benefit. Use partitioning, not sharding.
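A concrete version of the first point, on a hypothetical dataset: LIMIT changes neither query's bytes billed; only the partition filter does.

```sql
-- Bills a full scan of every daily shard despite returning 10 rows.
SELECT event_name
FROM `analytics_123456.events_*`
LIMIT 10;

-- Cheap, but only because the suffix filter prunes to one shard.
SELECT event_name
FROM `analytics_123456.events_*`
WHERE _TABLE_SUFFIX = '20260501'
LIMIT 10;
```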