What's in the GA4 BigQuery export schema?
The GA4 BigQuery export contains 30+ top-level fields per event row, organised into seven groups: identity (user_pseudo_id, user_id), timing (event_timestamp, event_date), event metadata (event_name, event_params), traffic source (traffic_source.*, session_traffic_source_last_click.* added 2024), device and geo (device.*, geo.*), e-commerce (ecommerce.*, items array), and derived fields (privacy_info.*, is_active_user, session_traffic_source_last_click.*).
Three of those fields are nested arrays — event_params, user_properties, and items — which need UNNEST() to query. The schema is consistent across all GA4 properties; the only variation is whether session_traffic_source_last_click is present (depends on export configuration date).
This post is the reference. Bookmark it.
The complete schema (30+ top-level fields)
Identity fields
| Field | Type | What it contains |
|---|---|---|
user_pseudo_id | STRING | Anonymous user ID (the GA cookie value). Persistent across sessions for the same browser. |
user_id | STRING | Custom user ID set via the User-ID feature. NULL if not implemented. |
pseudo_user_properties | RECORD (repeated) | User-scoped properties set with set_user_properties calls. |
user_first_touch_timestamp | INT64 | Timestamp of the user's first touch (first session start). |
Timing fields
| Field | Type | What it contains |
|---|---|---|
event_timestamp | INT64 | Microseconds since Unix epoch when the event was logged client-side. |
event_date | STRING | Event date in YYYYMMDD format (the table suffix). |
event_previous_timestamp | INT64 | Microseconds since Unix epoch of the previous event from the same user. |
event_server_timestamp_offset | INT64 | Difference between client and server timestamp in microseconds. |
Event metadata
| Field | Type | What it contains |
|---|---|---|
event_name | STRING | The event name (e.g., page_view, purchase, session_start). |
event_params | RECORD (repeated) | Array of key-value parameters attached to the event. Critical field — most analysis lives here. |
event_value_in_usd | FLOAT64 | Currency-converted value if event has a value parameter. |
event_bundle_sequence_id | INT64 | The sequential ID of the bundle this event was uploaded in. |
stream_id | STRING | Identifies the data stream the event came from. |
platform | STRING | The platform that generated the event (WEB, IOS, ANDROID). |
Traffic source
| Field | Type | What it contains |
|---|---|---|
traffic_source.source | STRING | First-touch source (lifetime user attribution). |
traffic_source.medium | STRING | First-touch medium. |
traffic_source.name | STRING | First-touch campaign name. |
session_traffic_source_last_click.manual_campaign | RECORD | Last-click source for the current session (added 2024). |
collected_traffic_source | RECORD | Source data from the actual collected event (newer field). |
The session_traffic_source_last_click field, added in 2024, gives session-level last-click attribution directly. Before this field, you had to extract source/medium from the session_start event's params.
Device and geo
| Field | Type | What it contains |
|---|---|---|
device.category | STRING | desktop, mobile, or tablet. |
device.mobile_brand_name | STRING | Apple, Samsung, etc. |
device.mobile_model_name | STRING | iPhone 15, Galaxy S24, etc. |
device.operating_system | STRING | iOS, Android, Windows, macOS, Linux. |
device.operating_system_version | STRING | OS version string. |
device.web_info.browser | STRING | Chrome, Safari, Firefox, Edge. |
device.web_info.browser_version | STRING | Browser version. |
device.language | STRING | Browser language preference (en-us, fr-fr). |
geo.continent | STRING | The continent. |
geo.country | STRING | The country (e.g., "United Kingdom"). |
geo.region | STRING | Subnational region (state, province). |
geo.city | STRING | City. |
geo.metro | STRING | Metro area code (US-specific). |
E-commerce
| Field | Type | What it contains |
|---|---|---|
ecommerce.transaction_id | STRING | Transaction ID for purchase events. |
ecommerce.purchase_revenue | FLOAT64 | Revenue from the transaction (excluding tax/shipping). |
ecommerce.purchase_revenue_in_usd | FLOAT64 | Revenue in USD. |
ecommerce.refund_value | FLOAT64 | Refund amount. |
ecommerce.shipping_value | FLOAT64 | Shipping cost. |
ecommerce.tax_value | FLOAT64 | Tax amount. |
ecommerce.unique_items | INT64 | Number of unique items in the transaction. |
ecommerce.total_item_quantity | INT64 | Total quantity across all items. |
items | RECORD (repeated) | Array of items in the event (product details). |
Derived and metadata
| Field | Type | What it contains |
|---|---|---|
is_active_user | BOOLEAN | Whether the user is considered active per GA4's definition. |
privacy_info.analytics_storage | STRING | Consent state for analytics_storage. |
privacy_info.ads_storage | STRING | Consent state for ad_storage. |
privacy_info.uses_transient_token | STRING | Whether the user uses a transient token (consent-denied modelled events). |
app_info.id | STRING | App package name (mobile only). |
app_info.version | STRING | App version (mobile only). |
Want to see which hidden implementation gaps are affecting your GA4 data quality?
How to unpack nested arrays
Three fields are repeated records that need UNNEST() to query:
event_params
The value.string_value, value.int_value, value.float_value, and value.double_value fields hold the actual parameter values depending on type. Use the COALESCE pattern when the type is mixed:
items
user_properties
Standard event_params worth knowing
Every GA4 event includes a default set of parameters. The most common:
| Param key | Value type | What it contains |
|---|---|---|
ga_session_id | int_value | The session identifier (used for sessionisation) |
ga_session_number | int_value | Sequential session number for the user (1, 2, 3...) |
page_location | string_value | Full URL of the page |
page_title | string_value | The page title |
page_referrer | string_value | The referring URL |
engagement_time_msec | int_value | Engagement time in milliseconds for this event |
session_engaged | int_value | Whether the session is engaged (1 or 0) |
entrances | int_value | Whether this is the entry event of the session (1 or 0) |
source, medium, campaign | string_value | Traffic source from the URL on the event |
gclid, dclid | string_value | Click ID parameters when present |
For purchase events specifically, additional params:
| Param key | Value type | What it contains |
|---|---|---|
currency | string_value | Currency code (USD, GBP, EUR) |
value | float_value or int_value | Total transaction value |
transaction_id | string_value | Transaction ID (mirrors ecommerce.transaction_id) |
coupon | string_value | Coupon code applied |
What changed in 2024–25
Three meaningful schema additions in the past 18 months:
`session_traffic_source_last_click` (added 2024). Top-level field giving session-level last-click attribution without needing to extract from session_start event params. Older properties may not have it for historical data — only forward.
`is_active_user` (added 2024). Boolean indicating whether GA4 considers the user active per its internal definition. Useful for filtering bot or spam events that GA4 has flagged as inactive.
`privacy_info.uses_transient_token` (added 2025). Indicates whether the user uses a transient token — the marker for consent-denied modelled events. Useful for separating modelled from observed data in BigQuery analyses.
Older properties may have legacy fields no longer documented (e.g., event_dimensions.hostname was deprecated in 2023 in favour of extracting hostname from page_location). When in doubt, query the schema directly: SELECT column_name, data_type FROM project.analytics_PROPERTY_ID.INFORMATION_SCHEMA.COLUMNS WHERE table_name LIKE 'events_%' LIMIT 100;
FAQ: GA4 BigQuery Schema Cheat Sheet: Every Field, What It Does
What should a team validate first when ga4 bigquery schema cheat sheet: every field, what it does appears?
How do I know whether the fix actually worked?
When should this become a full GA4 audit instead of a quick fix?
Related guides for GA4 BigQuery Schema Cheat Sheet: Every Field, What It Does
BigQuery Cost Optimisation for GA4 Exports: 9 SQL Patterns (2026)
The biggest cost wins come from nine SQL patterns: (1) partition pruning via _TABLE_SUFFIX BETWEEN (10–50x cost difference vs derived filters), (2) clustering on source/medium/event_name (30–60% reduction on top of partitioning), (3) explicit column selection (never SELECT *)…
How to Stitch GA4 BigQuery Sessions Manually (2026)
GA4 doesn't store sessions as records in BigQuery exports — only individual events with session identifiers. To reconstruct sessions: join on user_pseudo_id + (SELECT value.int_value FROM UNNEST(event_params) WHERE key='ga_session_id') as the unique session key…
Run a GA4 audit before ga4 bigquery schema cheat sheet: every field, what it does spreads into reporting decisions
Use GA4 Audits to surface implementation gaps, broken signals, and the next fixes to prioritize before the issue becomes harder to trust or explain.