OAuth 2.0 for Google (Analytics) API with Python Explained

In this tutorial I am going to explain how OAuth 2.0 works and how to apply it for interacting with the Google Analytics API using Python. Google provides a Python package for that purpose, although so far it only supports Python 2.

OAuth2 seems to be quite a mess at first and Google’s documentation on this subject is not that well organized in my opinion. So with this article I do my best to save you the sweat I had to invest. After all it’s not that complicated anyway, as you will probably agree.

Getting Started – Registering the Client Application

  • Install google-api-python-client
  • Download the example Python script (console.py) from GitHub and store it in folder X
  • Visit Google Developers Console
    1. Create project
    2. APIs & auth > APIs – activate Analytics API
    3. APIs & auth > Credentials – create new Client ID (Installed Application, default settings)
    4. APIs & auth > Consent screen – configure “Product Name” and “Email address”
    5. APIs & auth > Credentials – Download JSON and store as “client_secrets.json” in folder X. The secrets file contains your application’s ID and a secret token to authenticate it.
  • Log in to a Google account which can access your Google Analytics account

Flowing through OAuth 2.0 Step by Step

The numbers I use here refer to the correspondingly numbered steps in the process chart above.

(1) Run console.py

This will query GA for the number of users from January 1 2014 until December 31 2014 for the profile with ID 123456. If you replace 123456 with 0, then the script will find out the first available profile ID of your GA account and use that.

(2) The (so called) “Flow” begins with opening a browser and redirecting it to the consent page via the authorization URL. In this case it would be:

The different components are documented here.

  • scope : read permissions for Google Analytics
  • redirect_uri : authorization code via copy/paste (instead of via HTTP request)
  • response_type : for now we want the authorization code
  • client_id : for identifying the application
  • access_type : eventually we want not just an access token (“online”) but also a refresh token (“offline”) – think: if we have the refresh token we can renew the access token, even when the user is offline.
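To make the role of these five parameters more tangible, here is a minimal sketch that assembles such an authorization URL with nothing but the standard library. The endpoint URL, the read-only scope string and the placeholder client ID are my assumptions for illustration; the real client ID comes from your client_secrets.json.

```python
from urllib.parse import urlencode

# Assumed authorization endpoint and placeholder client ID (substitute your own).
AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/auth"
CLIENT_ID = "YOUR_CLIENT_ID.apps.googleusercontent.com"


def build_authorization_url(client_id):
    """Assemble the consent-page URL from the five parameters listed above."""
    params = {
        # read permissions for Google Analytics
        "scope": "https://www.googleapis.com/auth/analytics.readonly",
        # hand the code to the user for copy/paste instead of an HTTP redirect
        "redirect_uri": "urn:ietf:wg:oauth:2.0:oob",
        # we want an authorization code first
        "response_type": "code",
        "client_id": client_id,
        # "offline" asks for a refresh token in addition to the access token
        "access_type": "offline",
    }
    # urlencode percent-encodes the values, e.g. ":" becomes "%3A"
    return AUTH_ENDPOINT + "?" + urlencode(params)


auth_url = build_authorization_url(CLIENT_ID)
print(auth_url)
```

Opening the printed URL in a browser is what step (2) amounts to; the client library does the same thing on your behalf.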

(2a) The user is shown a consent page where s/he can confirm the authorization request.

(2b) If the user refuses – that’s that – if not then …

(3) … the authorization server now communicates the authorization code to the application. How this is done depends on the setup. In the case of a web application the authorization server would simply redirect the user’s browser again, this time to the redirect URI with the authorization code attached to the request. In our case this is not possible because we are working with a simple Python script which does not feature a web server. That’s the reason for the weird redirect URI (urn:ietf:wg:oauth:2.0:oob) specified above: it causes the authorization server to simply show the authorization code in the user’s browser so (3′) the user can copy it and then paste it into the console where the script is already waiting.

(4) With the authorization code at hand the application can now request an access token, which it needs in order to finally query the Google Analytics API. Actually (due to access_type being set to “offline” in the authorization URL above) we (5) get not just an access token but also a refresh token, so our application can automatically request a new access token after the old one has expired, without requiring user interaction.
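Under the hood, step (4) and the later token renewal are both plain POST requests to the token endpoint. The following is a minimal sketch of how those two requests could be built with the standard library, not the code the client library actually runs. The endpoint URL and all credential values are assumptions and placeholders, and nothing is sent over the network.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed token endpoint; client ID and secret would come from client_secrets.json.
TOKEN_ENDPOINT = "https://oauth2.googleapis.com/token"


def build_token_request(code, client_id, client_secret):
    """Step (4): exchange the pasted authorization code for tokens."""
    body = urlencode({
        "grant_type": "authorization_code",
        "code": code,
        "client_id": client_id,
        "client_secret": client_secret,
        # must match the redirect URI used in the authorization URL
        "redirect_uri": "urn:ietf:wg:oauth:2.0:oob",
    }).encode("ascii")
    return Request(TOKEN_ENDPOINT, data=body, method="POST")


def build_refresh_request(refresh_token, client_id, client_secret):
    """Later on: trade the refresh token for a fresh access token."""
    body = urlencode({
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("ascii")
    return Request(TOKEN_ENDPOINT, data=body, method="POST")


# Placeholder values for illustration only; the requests are built but not sent.
token_request = build_token_request("PASTED_CODE", "CLIENT_ID", "CLIENT_SECRET")
refresh_request = build_refresh_request("REFRESH_TOKEN", "CLIENT_ID", "CLIENT_SECRET")
print(token_request.get_method(), token_request.full_url)
```

Passing such a request to urllib.request.urlopen would return a JSON document which, per the OAuth 2.0 specification, carries fields such as access_token and, for the first exchange with offline access, refresh_token.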

Steps (2) to (5) are covered by:

Querying Google Analytics

The credentials object resulting from the authorization dance is then used to extend an HTTP object, which in turn is used for creating an API-specific service object.

The API call is then built on top of the service object by chaining methods. A succinct interactive documentation of possible API calls is provided by Google’s API Explorer. The first item analytics.data.ga.get corresponds to our call here.
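If you are curious what the chained call boils down to on the wire, here is a rough sketch of the equivalent raw HTTP request, built with the standard library. The endpoint URL and the parameter names reflect my understanding of the Core Reporting API (v3) and should be treated as assumptions; the access token is a placeholder and the request is only constructed, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed base URL of the Core Reporting API (v3) behind analytics.data.ga.get.
GA_DATA_ENDPOINT = "https://www.googleapis.com/analytics/v3/data/ga"


def build_users_query(access_token, profile_id):
    """Build the request for the number of users in 2014 for one profile."""
    query = urlencode({
        "ids": "ga:%d" % profile_id,   # profile (view) ID, prefixed with "ga:"
        "start-date": "2014-01-01",
        "end-date": "2014-12-31",
        "metrics": "ga:users",
    })
    request = Request(GA_DATA_ENDPOINT + "?" + query)
    # This header is what "extending the HTTP object" with credentials amounts to.
    request.add_header("Authorization", "Bearer " + access_token)
    return request


# Placeholder token and the example profile ID from the text.
ga_request = build_users_query("ACCESS_TOKEN", 123456)
print(ga_request.full_url)
```

The service object from the client library hides exactly this kind of URL building and header handling behind its method chain.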

The result of the query will be encapsulated in a JSON object:
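As a hedged illustration of how to pull the number out of such a result, the sketch below parses a made-up and heavily abbreviated response. The field names totalsForAllResults and rows are what I would expect from the Core Reporting API (v3); the value itself is invented and a real response contains many more fields.

```python
import json

# Made-up, heavily abbreviated response for illustration only.
sample_response = """
{
  "totalsForAllResults": {"ga:users": "4711"},
  "rows": [["4711"]]
}
"""

result = json.loads(sample_response)

# Metric values arrive as strings, so convert before doing arithmetic.
users = int(result["totalsForAllResults"]["ga:users"])
print(users)
```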

Storing and Reusing Credentials

The credentials – that is the access token and the refresh token – are going to be valid beyond the session during which they were acquired. Hence it makes sense to store and reuse them. For that purpose a Storage class is provided for convenient persistence. I nonetheless chose to serialize to JSON here, as this method is more versatile.
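The idea of storing and reloading can be sketched independently of the client library. In the following minimal example a plain dict with placeholder tokens stands in for the credentials object, and the helper names save_credentials and load_credentials are hypothetical, chosen just for this illustration.

```python
import json
import os
import tempfile


def save_credentials(credentials, path):
    """Write the credentials as JSON so a later session can pick them up."""
    with open(path, "w") as handle:
        json.dump(credentials, handle)


def load_credentials(path):
    """Return the stored credentials, or None if nothing was stored yet."""
    if not os.path.exists(path):
        return None
    with open(path) as handle:
        return json.load(handle)


# A plain dict with placeholder tokens stands in for the credentials object.
credentials = {"access_token": "ACCESS_TOKEN", "refresh_token": "REFRESH_TOKEN"}

path = os.path.join(tempfile.mkdtemp(), "credentials.json")
save_credentials(credentials, path)
restored = load_credentials(path)
print(restored == credentials)
```

A script would call load_credentials first and only run through the consent flow when it returns None. Since the file contains the refresh token, it deserves the same care as a password.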

A serialized credentials object looks as follows:


(original article published on www.joyofdata.de)

4 thoughts on “OAuth 2.0 for Google (Analytics) API with Python Explained”

  1. Thanks for the article! That graph helps a lot to understand the process.

    Assuming the data is public (no need to authenticate, like a tweet), why do I still need the authorization server? Can the authentication be skipped in that case since the data I’m trying to access is public?

  2. Pingback: Another OAuth 2 explanation for Python access to Google Analytics | small means Big

  3. Great writeup. I agree with you, much sweat equity to get this set up, but once it’s up it’s pretty simple to understand. For some of your readers, if they are looking to move data in/out of GA through a scheduled process, they may want to consider switching to a Service Account. This avoids the need for the consent screen to authorize access. The Service Account setup also caused extensive headaches, and the documentation on Google is lacking. I documented the process to get that up and running here (https://smallmeansbig.wordpress.com/2014/11/23/how-to-call-a-google-api-using-a-service-account-part-1-of-2/). The example uses Python and a Service Account with the Content API for Shopping, but the concept is the same.

    • Hey David, that’s a good hint from you regarding usage of a service account in case of automated regular data loading. Kudos for your write-up!
