GA4 Sampling and Thresholds: When to Trust Your Reports
GA4 is frequently described as an unsampled analytics platform, but this is only partially true. Sampling and data thresholds still apply in specific contexts, and understanding when they kick in is essential for knowing how much to trust the numbers in any given report.
Where Sampling Applies in GA4
GA4 standard reports run on aggregated data tables that are precomputed for common metrics and dimensions, so they are unsampled. The sampling risk emerges in Explorations.
When you use the Explore workspace (formerly Analysis Hub) for custom reports, GA4 applies sampling if the exploration query would need to process more events within the date range than the property's per-query limit allows.
The limit for GA4 360 properties is higher than for free properties: GA4 360 applies sampling after approximately 1 billion events in an exploration query, while the free-tier limit is significantly lower (10 million events per query at the time of writing).
You will see a yellow or orange indicator in the exploration interface when sampling is active, along with an estimate of what percentage of data the result is based on.
Reports using short date ranges are almost never sampled. The risk area is explorations with long date ranges, many dimensions, or strict segment filters applied to high-traffic properties.
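As a rough illustration of this reasoning, the sketch below estimates whether an exploration is likely to be sampled by comparing the number of events in the date range with a per-query event limit. The limits (10 million for free properties, 1 billion for 360), the function name, and the traffic figures are assumptions for illustration. The model ignores segments and dimensions, so treat the result as a first-pass check rather than a prediction of what GA4 will do.

```python
# Assumed per-query event limits for explorations; verify against current GA4 documentation.
EVENT_LIMIT = {"standard": 10_000_000, "360": 1_000_000_000}


def estimate_sampling(avg_daily_events: int, days: int, tier: str = "standard") -> dict:
    """Estimate whether an exploration covering `days` days is likely to be sampled."""
    events_in_range = avg_daily_events * days
    limit = EVENT_LIMIT[tier]
    sampled = events_in_range > limit
    # If sampled, roughly limit / events_in_range of the data backs the result.
    share = min(1.0, limit / events_in_range) if events_in_range else 1.0
    return {
        "events": events_in_range,
        "sampled": sampled,
        "approx_share_used": round(share, 3),
    }


# Hypothetical property with 400,000 events per day.
print(estimate_sampling(400_000, 7))          # one week
print(estimate_sampling(400_000, 90))         # one quarter
print(estimate_sampling(400_000, 90, "360"))  # same quarter on a 360 property
```

With these made-up numbers the one-week query stays under the free-tier limit, while the 90-day query exceeds it and would be based on roughly a quarter of the events.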
Data Thresholds: The Less-Discussed Problem
Separate from sampling, GA4 applies data thresholds to reports when including certain dimensions could allow individual users to be re-identified.
This applies most often when Google Signals is enabled and when reports are segmented by demographics, device, or geographic dimensions at a granular level.
When a data threshold is applied, rows for small segments are withheld entirely. They are not estimated but removed, so that the data cannot reveal information about a small, identifiable group.
The threshold indicator in GA4 reports looks similar to the sampling indicator, but has a different implication: sampled data is an estimate of all data, while thresholded data simply has rows removed.
This makes thresholds more dangerous for analysis because the missing rows are not distributed proportionally across all segments. They are concentrated in the smallest, most granular segments, which may be exactly what you are trying to analyse.
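The difference can be made concrete with a toy example. The sketch below applies an idealised form of sampling and a simple threshold to the same made-up segment counts. The city names, the counts, and the `min_users` cut-off are all hypothetical. GA4 does not publish its actual threshold values, and real sampling introduces estimation error that this simplified model leaves out.

```python
# Hypothetical true user counts per segment (illustration only).
true_counts = {"London": 12_000, "Leeds": 900, "Bath": 40, "Ely": 8}


def sampled_view(counts: dict, rate: float) -> dict:
    """Idealised sampling: each row is built from `rate` of the data, then scaled back up."""
    return {seg: round(round(n * rate) / rate) for seg, n in counts.items()}


def thresholded_view(counts: dict, min_users: int) -> dict:
    """Thresholding: rows below `min_users` are dropped, not estimated."""
    return {seg: n for seg, n in counts.items() if n >= min_users}


sampled = sampled_view(true_counts, rate=0.25)
visible = thresholded_view(true_counts, min_users=50)

# Sampling keeps every segment; thresholding silently removes the small ones.
missing_rows = set(true_counts) - set(visible)
missing_users = sum(true_counts[seg] for seg in missing_rows)
print("sampled segments:", sorted(sampled))
print("thresholded segments:", sorted(visible))
print("rows withheld:", sorted(missing_rows), "users withheld:", missing_users)
```

In this toy data the withheld rows account for a tiny share of total users but half of the segments, which is why a thresholded report can look nearly complete in aggregate while missing the rows a granular analysis depends on.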
How to Get Unsampled, Unthresholded Data
For analysis that requires complete, unsampled data, BigQuery export is the definitive solution.
GA4's daily BigQuery export contains every event as a raw row, without sampling or thresholds, because the privacy thresholds are applied in the GA4 reporting interface rather than to the exported event data.
Running queries against BigQuery for critical analyses (revenue calculations, funnel completion rates, cohort analysis) gives you the ground truth that the GA4 interface approximates.
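As one example of such a query, the sketch below builds a BigQuery SQL string that counts distinct users who began checkout and distinct users who purchased within a date range. It assumes the standard export layout of daily `events_YYYYMMDD` tables with `event_name` and `user_pseudo_id` columns. The project and dataset names are hypothetical, and the funnel definition is deliberately simple: it does not check that a purchase followed a checkout by the same user.

```python
def funnel_query(table: str, start: str, end: str) -> str:
    """Build a BigQuery SQL string for a two-step checkout funnel.

    `table` is a wildcard table such as my-project.analytics_123456.events_* (hypothetical).
    `start` and `end` are YYYYMMDD strings matching the daily table suffixes.
    """
    for d in (start, end):
        if not (len(d) == 8 and d.isdigit()):
            raise ValueError(f"expected YYYYMMDD, got {d!r}")
    return f"""
SELECT
  COUNT(DISTINCT IF(event_name = 'begin_checkout', user_pseudo_id, NULL)) AS started,
  COUNT(DISTINCT IF(event_name = 'purchase', user_pseudo_id, NULL)) AS purchased
FROM `{table}`
WHERE _TABLE_SUFFIX BETWEEN '{start}' AND '{end}'
  AND event_name IN ('begin_checkout', 'purchase')
""".strip()


sql = funnel_query("my-project.analytics_123456.events_*", "20240101", "20240131")
print(sql)
```

The resulting string can be pasted into the BigQuery console or run with a client library. Dividing `purchased` by `started` gives a completion rate computed from every exported event rather than from a sampled or thresholded report.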
For organisations that do not have BigQuery, an alternative is to use the GA4 Data API for programmatic data extraction. This also avoids sampling for most queries, though data thresholds can still apply depending on the dimensions requested.
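When pulling data this way, it helps to check each response for signs of thresholding or sampling before trusting it. The sketch below inspects the JSON body of a report response for the `metadata.subjectToThresholding` and `metadata.samplingMetadatas` fields. These field names reflect my understanding of the Data API's `runReport` response and should be verified against the current API reference. The helper name and the sample payload are hypothetical.

```python
def data_quality_flags(response: dict) -> dict:
    """Summarise thresholding and sampling hints from a report response (JSON as dict)."""
    meta = response.get("metadata", {})
    sampling = meta.get("samplingMetadatas", [])
    shares = []
    for entry in sampling:
        # 64-bit integers arrive as strings in JSON, so convert before dividing.
        read = int(entry.get("samplesReadCount", 0))
        space = int(entry.get("samplingSpaceSize", 0))
        if space:
            shares.append(read / space)
    return {
        "thresholded": bool(meta.get("subjectToThresholding", False)),
        "sampled": bool(sampling),
        "sample_share": min(shares) if shares else None,
    }


# Hypothetical response fragment for illustration.
example = {
    "metadata": {
        "subjectToThresholding": True,
        "samplingMetadatas": [
            {"samplesReadCount": "2500000", "samplingSpaceSize": "10000000"}
        ],
    }
}
print(data_quality_flags(example))
```

A pipeline could log these flags alongside each extract, or refuse to load a result that is marked as thresholded when the analysis depends on small segments.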
Understanding the difference between what the GA4 interface shows and what the underlying data contains is a prerequisite for making high-stakes decisions with confidence.