Google Analytics 4 (GA4) employs behavioral modeling to estimate user activity when identifiers are unavailable, ensuring comprehensive insights despite privacy constraints. This modeling requires specific conditions to be met, such as implementing consent mode across all pages and collecting sufficient data. Key event modeling, on the other hand, is independent of consent mode.
Behavioral modeling in GA4
Behavioral modeling in GA4 estimates user activity when identifiers like cookies or User IDs are not fully available (e.g., when a user opts out of tracking through a consent management platform like OneTrust), providing insights into user behavior that would otherwise be unobservable. This feature helps maintain complete reporting in scenarios where data collection is limited due to privacy constraints or user preferences. To enable behavioral modeling, certain prerequisites must be met:
- Advanced consent mode must be implemented across all website pages, ensuring Google code loads before the consent dialog appears.
- The property must collect at least 1,000 events daily with ‘analytics_storage=denied’ for a minimum of 7 days.
- At least 1,000 users daily must send events with ‘analytics_storage=granted’ on at least 7 of the previous 28 days.
These conditions ensure sufficient data for training the machine learning model, which adapts to reflect the unique characteristics of each digital property and its user behavior. Note that behavioral modeling can produce data differences between GA4 explorations and reports.
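The two volume prerequisites can be pre-checked against your own daily counts. The following is a minimal sketch, not Google's actual eligibility logic: the function names and the input lists are hypothetical, each threshold is read as "at least", and the 7 qualifying days are assumed to be counted within the most recent 28 days for both conditions.

```python
from typing import Sequence

# Thresholds taken from the prerequisites listed above.
MIN_DENIED_EVENTS = 1_000   # daily events with analytics_storage=denied
MIN_GRANTED_USERS = 1_000   # daily users with analytics_storage=granted
MIN_DAYS = 7                # qualifying days required
WINDOW_DAYS = 28            # lookback window (assumed for both conditions)


def qualifying_days(daily_counts: Sequence[int], minimum: int,
                    window: int = WINDOW_DAYS) -> int:
    """Count days in the most recent `window` entries that reach `minimum`.

    `daily_counts` is ordered oldest to newest, one entry per day.
    """
    recent = daily_counts[-window:]
    return sum(1 for count in recent if count >= minimum)


def meets_volume_thresholds(denied_events: Sequence[int],
                            granted_users: Sequence[int]) -> bool:
    """Return True when both volume prerequisites appear to be satisfied."""
    enough_denied = qualifying_days(denied_events, MIN_DENIED_EVENTS) >= MIN_DAYS
    enough_granted = qualifying_days(granted_users, MIN_GRANTED_USERS) >= MIN_DAYS
    return enough_denied and enough_granted
```

For example, passing 28 days of denied-event counts and 28 days of granted-user counts returns True only if each list has at least 7 days at or above 1,000. A True result only indicates that the volume conditions look satisfied; it says nothing about whether advanced consent mode is implemented correctly.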
Key event modeling in GA4
Key event modeling in GA4 operates independently of consent mode configurations. Unlike behavioral modeling, which estimates user activity when identifiers are unavailable, key event modeling continues to function regardless of whether users have consented to analytics cookies. Although this feature is always applied, its accuracy depends on the amount and quality of available data.
GA4 reporting identity settings
Reporting identity settings in GA4 determine how user data is processed and presented in reports. The “Blended” option, which includes behavioral modeling when requirements are satisfied, is one of the available settings for reporting identity. This setting integrates behavioral modeled data seamlessly with observed data in reports, potentially showing higher user counts compared to reports with only observed data.
Check the GA4 data quality icon for a reference to “estimated user data” to determine if the “Blended” reporting identity is in effect and behavioral modeled data is included with observed data for a given GA4 report. However, GA4 reporting does not explicitly indicate how much behavioral modeled vs. observed data is included in reports for a date range. GA4 also does not indicate if or how much modeled vs. observed key events are included in reporting.
NOTE: While the reporting identity setting affects the display of behaviorally modeled data, it does not impact key event modeled data.
No GA4 modeled data in the BigQuery export
There are multiple GA4 reporting surfaces: standard reports, explorations, the Data API and BigQuery. BigQuery, accessed via a Google Cloud Platform project, is not part of the GA4 platform, but you can report on raw GA4 data exported to BigQuery.
Reporting from GA4 data exported to BigQuery is valuable because it avoids the high cardinality, sampling, data thresholds and GA4 (other) row instances that can affect the other GA4 reporting surfaces. However, the raw GA4 data exported to BigQuery does not include the modeled data added by data-driven attribution, key event modeling, and behavioral modeling.
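Because the export contains only collected data, one way to see what you are working with is to break exported events down by consent state. The sketch below only builds a query string and does not run it. The function name and the example project and dataset names are hypothetical; the `events_*` table naming, `_TABLE_SUFFIX` filter and `privacy_info.analytics_storage` field follow the GA4 BigQuery export schema, but verify them against your own dataset before relying on the output.

```python
import re


def consent_breakdown_sql(dataset: str, start: str, end: str) -> str:
    """Build a BigQuery SQL string counting exported events per consent state.

    dataset: "project.dataset" of the GA4 export (hypothetical example below).
    start, end: inclusive date suffixes in YYYYMMDD form.
    """
    # Basic validation so arbitrary text is not interpolated into the query.
    if not re.fullmatch(r"[A-Za-z0-9_\-]+\.[A-Za-z0-9_]+", dataset):
        raise ValueError(f"unexpected dataset name: {dataset!r}")
    for value in (start, end):
        if not re.fullmatch(r"\d{8}", value):
            raise ValueError(f"expected YYYYMMDD, got: {value!r}")

    return (
        "SELECT\n"
        "  privacy_info.analytics_storage AS analytics_storage,\n"
        "  COUNT(*) AS events\n"
        f"FROM `{dataset}.events_*`\n"
        f"WHERE _TABLE_SUFFIX BETWEEN '{start}' AND '{end}'\n"
        "GROUP BY analytics_storage\n"
        "ORDER BY events DESC"
    )


# Hypothetical project and dataset names, for illustration only.
print(consent_breakdown_sql("my-project.analytics_123456789", "20240101", "20240107"))
```

The resulting string can be pasted into the BigQuery console or passed to a client library. Whatever the counts show, they describe collected events only, so they will not match GA4 reports that blend in modeled data.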