👋 In this article, we'll clarify how to count your users correctly in GA4 and BigQuery, by comparing two key approaches:
-COUNT
(DISTINCT) for absolute precision.
-HyperLogLog
(HLL) for large-scale performance.
The aim here is to help you choose intelligently, according to your use cases, between analytical rigor and budgetary control.

Background: why is user counting strategic?

A good user count is much more than just a line on a dashboard.
It's the foundation of your business KPIs:

  • It influences your conversion rates, performance reports and marketing decisions.
  • A counting error? This distorts your customer acquisition cost (CAC), your ROAS calculations, and can lead you to invest... on the wrong basis.

Common problems with GA4 alone

  • Incomplete or sampled data (especially for large volumes).
  • History limited to 14 months.
  • Restricted access to certain fields such as basket contents or user IDs.

👉 Exporting to BigQuery becomes indispensable for accessing raw data in its entirety, avoiding sampling and carrying out advanced customized analyses.

Two methods, two philosophies

1. COUNT(DISTINCT)

Counting exact based on a unique identifier (e.g: user_id).

  • Advantages: Total precision, legally required (RGPD).
  • Disadvantages: Very costly for large volumes.

Example:

SELECT COUNT(DISTINCT user_pseudo_id)
FROM `project.dataset.events_*`
WHERE device.category != 'bot'
💡 SEO tips : Exclude bots with WHERE device.category != 'bot'.

2. HyperLogLog (HLL)

A probabilistic algorithm designed to quickly and inexpensively estimate the number of unique elements (e.g.: number of users, emails, sessions, etc.).
  • Benefits: 99% accuracy, 60-90%cost reduction , idealfor processing very large volumes of data, perfect for trending and daily analysis.
  • Disadvantages: it's not an exact count (the results are very close, but not rigorously accurate) and therefore unusable for reports that require rigorously accurate results, such as legal or financial reports.

Example:

SELECT HLL_COUNT.EXTRACT(HLL_COUNT.INIT(user_pseudo_id, 14))
FROM `project.dataset.events_*`
💡 The second parameter (here 14) corresponds to the desired precision (between 10 and 24). The higher the number, the more accurate (and expensive) the estimate. Google uses 14 by default.

⚖️ Comparison: Which method for which use?

👉Check data regularly with COUNT to validate HLL estimates.
Criteria
COUNT(DISTINCT)
HyperLogLog (HLL)
Recommendation
Precision
✅ 100%
⚠️ 99% (estimated)
COUNT for critical decisions
BigQuery cost
💸 high (~€12.50/req)
💰 low (~€0.85/req)
HLL for massive volumes
Speed
⏳ slow
⚡ fast
HLL for live/daily dashboards
Legal compliance
✅ RGPD compliant
❌ not enough on its own
COUNT mandatory for audits (especially monthly)
Marketing use
Segmentation, retargeting
Global trends
Combine the two intelligently

Practical Applications

Case studies

"GA4 underestimates my SEO traffic!"

➡️ Check in BigQuery with COUNT(DISTINCT) on sessions or users.

This is often when you realize that GA4 sampling was costing you 10-20% in lost traffic.

"My Ads campaigns don't seem very profitable..."

➡️ HLL allows you to track trends at low cost.

Cross it with a monthly COUNT to identify underestimated levers.

Testimonial

We performed these tests on data exported from Google Analytics to BigQuery, for the period January 01, 2025 to March 12, 2025.

The comparative results of the three methods with the GA4 interface are available in the following file:

Google Sheets - Analysis results

Integrated FAQ

Q: "Why do GA4 and BigQuery give different results?"

R: GA4 samples the data, while BigQuery uses the raw data.

Q: "Is HyperLogLog reliable for strategic decisions?"

R: Yes to 99.9%, but validate with COUNT for critical ratios.

Compare COUNT(DISTINCT) for total precision, but expensive and HyperLogLog for fast, cost-effective estimates. Use COUNT for critical audits and HLL for large-scale trends.

Convictions
‍"User counting is not an exact science... but a science of intelligent compromise."
  • ☑️ Don't make accuracy sacred: 99% = enough to act.
  • ☑️ GA4 + BigQuery = forced marriage: One without the other is blind or paralyzed.
  • ☑️ HLL is a superpower: Learn to harness it, or pay the price

A need, a question?

Write to us at hello@starfox-analytics.com.
Our team will get back to you as soon as possible.

Contents
Post Tab Link
Post Tab Link

Follow Starfox Analytics on Linkedin so you don't miss a thing.