In this tutorial I am going to explain how OAuth 2.0 works and how to use it to interact with the Google Analytics API from Python. Google provides a Python package for that purpose, although so far it only supports Python 2.
OAuth2 looks like quite a mess at first, and Google’s documentation on the subject is not that well organized in my opinion. So with this article I do my best to save you the sweat I had to invest. In the end it is not that complicated, as you will probably agree.
What I am going to showcase in this tutorial is how to load web stats from Google Analytics into a fact table with Pentaho Kettle/PDI, and then how to represent that fact table with a Mondrian 3.6 schema so we can visualize the data with Saiku on Pentaho BI Server. In the end I’ll give my two cents on Saiku Analytics and possible options involving d3.js and Roland Bouman’s xmla4js.
In case you are new to this I recommend reading my articles on the following topics involved here:
First, let me whet your appetite by showing you what a pretty report you will be able to compose once you have finished this tutorial. There you go (click on the picture to bask in the report’s full glory!):
In my last post I described how to calculate “returning visitors” in a customizable way, depending on how you want to define “returning”. At work as well as for personal projects I use Pentaho Data Integration (PDI), aka Kettle, for ETL processes. PDI provides a step for fetching data from Google Analytics, and in this post I am going to describe how to use this feature, based on the job I clicked together for the article on “returning visitors”. I will focus on the steps and aspects relevant to the subject.
Google Analytics offers a KPI for “returning visitors”, but what if you would like to be more specific about the meaning of “returning”? This figure can actually be customized with basic API requests and a very simple idea, at least for two consecutive time spans.
Let’s assume we want to know how many visitors from calendar week 2013-1 (Dec 31 2012 until Jan 6 2013) returned to the website in calendar week 2013-2 (Jan 7 2013 until Jan 13 2013). I’ll refer to calendar week 2013-1 as T1, to 2013-2 as T2 and to both combined as T1+T2. The function v maps a time span onto its number of unique visitors, so v(T1) = 5 means that Analytics counted 5 unique visitors in calendar week 2013-1. The number of visitors in T2 who also visited in T1 is then:
“Number of visitors from T1 who came back in T2” = v(T1) + v(T2) - v(T1+T2)
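To make the formula concrete, here is a minimal Python sketch. The function name `returning_visitors` and the sample visitor IDs are made up for illustration; in practice the three counts would come from three separate Analytics queries (one each for T1, T2 and the combined span). The set-based check at the end only demonstrates why the formula works: every visitor who appears in both weeks is counted twice in v(T1) + v(T2) but only once in v(T1+T2).

```python
def returning_visitors(v_t1: int, v_t2: int, v_combined: int) -> int:
    """Return the number of visitors counted in both T1 and T2.

    v_t1       -- unique visitors in the first time span, v(T1)
    v_t2       -- unique visitors in the second time span, v(T2)
    v_combined -- unique visitors over both spans together, v(T1+T2)
    """
    # The combined count can be neither smaller than the larger single
    # count nor larger than the sum of both single counts.
    if not max(v_t1, v_t2) <= v_combined <= v_t1 + v_t2:
        raise ValueError("inconsistent visitor counts")
    return v_t1 + v_t2 - v_combined


# Illustration with hypothetical visitor IDs (not real Analytics data).
t1 = {"a", "b", "c", "d", "e"}   # visitors in calendar week 2013-1
t2 = {"d", "e", "f"}             # visitors in calendar week 2013-2

result = returning_visitors(len(t1), len(t2), len(t1 | t2))
assert result == len(t1 & t2)    # same answer as intersecting the sets
print(result)                    # 2 ("d" and "e" came back)
```

The consistency check assumes exact counts. If the figures returned by the API are sampled or approximated, it may be too strict and could be relaxed.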