Sunday, July 28, 2013

Even Google can't get their numbers straight

Google has many entities and products, some grown within the organisation and others acquired externally. It appears that even Google, the leader in Data Science and Analytics, cannot get its numbers straight across products: Google Analytics vs. Blogger.

Is this blog really that popular? Really?

While I was checking this blog's traffic numbers in Blogger's built-in "Stats" function, I was surprised to see that the blog seems to be really popular, even though I have not been good (sorry!) at writing much for some time. As an ex-SEO'er, I had an inkling that something was not right. Up comes Google Analytics.

Blogger Stats numbers are 4.5 times bigger than Google Analytics'.

After checking my Google Analytics (GA) numbers, I was really surprised to see that the Blogger Pageview numbers were 4.5 times bigger than the GA numbers. That is a staggering difference!

After some research on the web, I concluded that:
  1. GA is much closer to the truth (but not the whole truth, see 3 below).
  2. Blogger stats include all kinds of bot traffic, so they are heavily inflated (GA tries to filter most of it out).
  3. GA cannot count any traffic if the user has disabled JavaScript. Some folks suggest it undercounts traffic by 50%, but there is no hard evidence to back that up, so take it with a grain of salt.
  4. Blogger seems set on reporting only Pageviews, not any other useful metrics, such as Visits or Unique Visitors. Not sure why.
  5. This blog has probably been targeted by a spam bot. On closer inspection, one of the bots probably comes from a particular Dutch ISP.
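
To make points 2 and 4 a bit more concrete, here is a minimal Python sketch of how counting every hit versus filtering out obvious bots can produce very different Pageview numbers. Everything in it is an assumption for illustration: the sample hits are made up (they are not my real traffic), the `BOT_PATTERN` list is a naive guess, and neither Blogger nor GA is known to work this way internally.

```python
import re

# Hypothetical user-agent fragments that hint at automated traffic.
# A real bot list would be much longer; this one is only for illustration.
BOT_PATTERN = re.compile(r"bot|crawler|spider|curl|wget", re.IGNORECASE)


def is_bot(user_agent):
    """Return True if the user agent matches the naive bot pattern."""
    return bool(BOT_PATTERN.search(user_agent))


def summarize(hits):
    """Summarize an iterable of (visitor_id, user_agent) pageview records."""
    raw_pageviews = 0
    filtered_pageviews = 0
    visitors = set()
    for visitor_id, user_agent in hits:
        raw_pageviews += 1  # counts every hit, bots included
        if is_bot(user_agent):
            continue
        filtered_pageviews += 1  # counts only hits that look human
        visitors.add(visitor_id)
    inflation = raw_pageviews / filtered_pageviews if filtered_pageviews else None
    return {
        "raw_pageviews": raw_pageviews,
        "filtered_pageviews": filtered_pageviews,
        "unique_visitors": len(visitors),
        "inflation": inflation,
    }


# Made-up sample: 2 human pageviews and 7 bot hits.
sample_hits = [
    ("a", "Mozilla/5.0 (Windows NT 6.1) Firefox/22.0"),
    ("b", "Mozilla/5.0 (Macintosh) Safari/536"),
    ("x", "Googlebot/2.1"),
    ("y", "SpamCrawler/1.0"),
    ("y", "SpamCrawler/1.0"),
    ("y", "SpamCrawler/1.0"),
    ("y", "SpamCrawler/1.0"),
    ("z", "curl/7.30.0"),
    ("z", "curl/7.30.0"),
]

print(summarize(sample_hits))
```

With this made-up sample, the raw count is 9 Pageviews while the filtered count is 2, from 2 unique visitors. The point is only that one aggressive bot is enough to distort a Pageviews-only report, which is why a second metric such as Unique Visitors is useful.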

Share best practice and be consistent.

I would have expected Google, the leader in Data Science and Analytics, to share best practice amongst its entities and products, such as reporting on key metrics (not just Pageviews).

I would also have expected Google to have a consistent set of numbers across its entities and products. It doesn't appear so either.

More often than not, the majority of a Business Intelligence (BI) analyst's time is spent verifying and reconciling numbers across various reports. Major BI tech giants sell BI applications that often suggest they will reduce such activities and increase business confidence in the numbers in the data warehouse. However, it is still a major challenge for most companies, as evidenced here. Without a good and reliable data source, the validity of any subsequent analysis is heavily undermined.

Let's try to stay consistent.
That goes for the metric choice, and the numbers.

FYI: if you want to find out whether your site is being attacked by spam bots, and by whom, read this helpful post.
