Twitter’s REST API v1.1 with R (for Linux and Windows)

In this tutorial I describe a straightforward way to use Twitter’s REST API v1.1 from R. For that purpose I wrote a small package (RTwitterAPI), so that requesting data requires only the API URL, the API parameters and a vector containing the OAuth parameters.

Before you can get started you have to log in to your Twitter account on dev.twitter.com, create an application and generate an “Access Token” for it. So let’s jump right in and fetch the IDs of 10 followers of @hrw (Human Rights Watch). The necessary code is located on GitHub as a package named RTwitterAPI, which may be installed using devtools::install_github().

The Linux Way …

… (which might also work for some Windows installations – not mine though) uses RCurl::getURL() for executing the GET request.

The result is a JSON document containing the IDs of 10 followers, which we print prettified using jsonlite::prettify():
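As a rough illustration of that last step, here is a minimal sketch. The JSON string is made up for this example (the IDs are not real and the field names are only what I would expect a followers/ids response to look like); it is not output from an actual API call.

```r
library(jsonlite)

# Made-up response body standing in for what the API call would return.
# The IDs and cursor values are placeholders, not real data.
response <- '{"ids":[1111111,2222222,3333333],"next_cursor":0,"next_cursor_str":"0","previous_cursor":0,"previous_cursor_str":"0"}'

# prettify() re-indents the JSON text without changing its content
pretty <- prettify(response, indent = 2)
cat(pretty)

# fromJSON() turns the same text into an R list if you want to work with the IDs
follower_ids <- fromJSON(response)$ids
```

prettify() only changes the layout of the text, so it is useful for inspecting a response; for further processing fromJSON() is usually the more convenient entry point.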

The Windows Way …

… requires you to install Cygwin first (which I recommend anyway because it is pretty awesome), and on Cygwin you have to install cURL. What it does is this: it crafts a full command for Cygwin using cURL and feeds this command string to Cygwin’s bash.exe via system(). The reason for this workaround is an obscure certificate issue which I did not manage to resolve properly. Once you have installed Cygwin (and curl), all that changes is the invocation of RTwitterAPI::twitter_api_call(). In the following example I assume Cygwin to be located under “C:\cygwin64\”.

The bash.exe command is spat out and can be run directly from the DOS prompt, or from Cygwin if you restrict the command to the respective part of the string.

OAuth Version 1

Twitter provides access to its service via a REST API whose current version is 1.1. Authorization is realized through OAuth version 1.0a. Because of that, handling the API is less trivial than just appending your password to the request, but also considerably more secure. In contrast to standard password-based authentication, OAuth distinguishes between the server (Twitter), the third-party client (the program calling the API) and the resource owner (the owner of the API account). It specifies an authentication flow in which the resource owner grants access to the third-party client, with a secret token being provided programmatically at the end. But as in this case the third-party client and the resource owner are effectively identical, the process simplifies to manually creating an access token and its corresponding “token secret” and then using those directly within the script. Referring to the OAuth 1 authentication flow chart, we can skip steps A to F and focus entirely on G.
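To make "using those directly within the script" a bit more concrete, here is one possible way to hold the four manually created values in a named vector. This is only a sketch: the environment variable names and the vector's element names are my own assumptions for illustration and not necessarily what RTwitterAPI expects.

```r
# Read the four credentials from environment variables so they do not have to
# be hard-coded in the script. The variable names are hypothetical.
oauth_credentials <- c(
  consumer_key       = Sys.getenv("TWITTER_CONSUMER_KEY"),
  consumer_secret    = Sys.getenv("TWITTER_CONSUMER_SECRET"),
  oauth_token        = Sys.getenv("TWITTER_ACCESS_TOKEN"),
  oauth_token_secret = Sys.getenv("TWITTER_ACCESS_TOKEN_SECRET")
)

# Sys.getenv() returns "" for unset variables, so report what is missing
missing_credentials <- names(oauth_credentials)[oauth_credentials == ""]
if (length(missing_credentials) > 0) {
  message("Missing credentials: ", paste(missing_credentials, collapse = ", "))
}
```

Keeping the secrets out of the script itself also makes it easier to share the code without leaking the tokens.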

Signing the Request for OAuth1

A request is authorized by gluing together a string that contains all the details of that request (HTTP method, URL, query parameters, OAuth keys and values) and then signing this so-called “signature base string” with the two secret tokens, oauth_token_secret and consumer_secret, using an algorithm referred to as HMAC-SHA1. HMAC-SHA1 is provided by the digest package. What you get in the end, after some more processing, is the oauth_signature, which is sent along with the request and verified by the server. The creation of that signature is implemented in RTwitterAPI/oauth1_signature.R; a detailed description may be found here.
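The following is a minimal sketch of that signing procedure, written independently of the package, so it is not the code in oauth1_signature.R. The function names are mine, all key and token values are placeholders, and I use jsonlite::base64_enc() for the final Base64 step purely as a convenience.

```r
# Percent-encode one string: every byte except A-Z, a-z, 0-9, "-", ".", "_"
# and "~" becomes %XX with upper-case hexadecimal digits.
pct_encode <- function(x) {
  bytes <- charToRaw(enc2utf8(x))
  unreserved <- charToRaw(paste0(c(LETTERS, letters, 0:9, "-", ".", "_", "~"),
                                 collapse = ""))
  encoded <- ifelse(bytes %in% unreserved,
                    rawToChar(bytes, multiple = TRUE),
                    sprintf("%%%02X", as.integer(bytes)))
  paste(encoded, collapse = "")
}

# Build the signature base string from the HTTP method, the URL (without
# query string) and a named character vector of ALL parameters
# (query parameters plus oauth_* parameters, but not oauth_signature).
signature_base_string <- function(method, url, params) {
  keys   <- vapply(names(params), pct_encode, character(1), USE.NAMES = FALSE)
  values <- vapply(params, pct_encode, character(1), USE.NAMES = FALSE)
  ord <- order(keys, values, method = "radix")  # byte-wise, locale-independent
  param_string <- paste(paste0(keys[ord], "=", values[ord]), collapse = "&")
  paste(toupper(method), pct_encode(url), pct_encode(param_string), sep = "&")
}

# Sign the base string with HMAC-SHA1 and Base64-encode the raw digest.
sign_base_string <- function(base_string, consumer_secret, token_secret) {
  signing_key <- paste(pct_encode(consumer_secret), pct_encode(token_secret),
                       sep = "&")
  raw_hmac <- digest::hmac(key = signing_key, object = base_string,
                           algo = "sha1", serialize = FALSE, raw = TRUE)
  jsonlite::base64_enc(raw_hmac)
}

# Placeholder values only; real nonce, timestamp and keys differ per request.
params <- c(screen_name = "hrw", count = "10",
            oauth_consumer_key = "CONSUMER_KEY", oauth_nonce = "NONCE",
            oauth_signature_method = "HMAC-SHA1",
            oauth_timestamp = "1400000000",
            oauth_token = "ACCESS_TOKEN", oauth_version = "1.0")
base_string <- signature_base_string(
  "GET", "https://api.twitter.com/1.1/followers/ids.json", params)

if (requireNamespace("digest", quietly = TRUE) &&
    requireNamespace("jsonlite", quietly = TRUE)) {
  oauth_signature <- sign_base_string(base_string, "CONSUMER_SECRET", "TOKEN_SECRET")
}
```

Note that the resulting signature still has to be percent-encoded itself before it is placed into the request header.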

Structure of the GET Request

The GET request that is finally sent via RCurl needs a properly set up header containing an “Authorization” field providing all the various oauth_* settings. This part is implemented in RTwitterAPI/twitter_api_call.R. The meaning of the oauth_* key/value pairs, as well as the composition of the header, is described here.
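A sketch of how such a header value might be assembled is shown below. It is not the package's implementation; the function name is hypothetical, all values are placeholders, and I assume the values have already been percent-encoded.

```r
# Assemble the value of the Authorization header from a named character
# vector of oauth_* parameters. Values are assumed to be percent-encoded
# already (note the "%3D" in the placeholder signature).
build_authorization_header <- function(oauth_params) {
  pairs <- sprintf('%s="%s"', names(oauth_params), oauth_params)
  paste0("OAuth ", paste(pairs, collapse = ", "))
}

oauth_params <- c(
  oauth_consumer_key     = "CONSUMER_KEY",
  oauth_nonce            = "NONCE",
  oauth_signature        = "SIGNATURE%3D",
  oauth_signature_method = "HMAC-SHA1",
  oauth_timestamp        = "1400000000",
  oauth_token            = "ACCESS_TOKEN",
  oauth_version          = "1.0"
)

auth_header <- build_authorization_header(oauth_params)

# With RCurl the header could then be passed along roughly like this
# (not run here, since it needs valid credentials):
# RCurl::getURL(
#   "https://api.twitter.com/1.1/followers/ids.json?screen_name=hrw&count=10",
#   httpheader = c(Authorization = auth_header))
```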

A few Notes on Escaping with RCurl::curlEscape() and URLencode()

For proper generation of the signature string it is important that characters which need encoding are represented with upper-case hexadecimal digits and that “.”, “_”, “-” and “~” are not encoded. This was a bit tedious to figure out. Both requirements are documented here.

The upper case condition is met by RCurl::curlEscape() but not by URLencode():

The second requirement was met by my Linux R setup, but oddly not by my Windows one, where RCurl::curlEscape() would escape “.”, “-”, “_” and “~”. This is why I added a condition to oauth1_signature() which causes those characters to be resubstituted if necessary.
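To illustrate the idea of resubstitution, here is a small sketch in base R. It is not the condition used in oauth1_signature(); the function name is hypothetical, and it simply post-processes an already escaped string so that both requirements hold.

```r
# Post-process an already percent-encoded string:
# 1. make the hexadecimal digits of every %XX escape upper case
# 2. turn escaped ".", "-", "_" and "~" back into the literal characters
normalize_escapes <- function(x) {
  x <- gsub("(%[0-9A-Fa-f]{2})", "\\U\\1", x, perl = TRUE)
  x <- gsub("%2E", ".", x, fixed = TRUE)
  x <- gsub("%2D", "-", x, fixed = TRUE)
  x <- gsub("%5F", "_", x, fixed = TRUE)
  x <- gsub("%7E", "~", x, fixed = TRUE)
  x
}

# An over-escaped input with lower-case hex digits, made up for this example
normalize_escapes("screen%5fname%3dh%2er%2Dw%7e")
```

Running the escapes through such a function after RCurl::curlEscape() or URLencode() makes the result independent of how a particular platform or R version happens to escape those four characters.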


(original article published on www.joyofdata.de)

6 thoughts on “Twitter’s REST API v1.1 with R (for Linux and Windows)”

  1. I encountered the error below in Windows 8.1 when attempting to use the example from the README.

    Error in function (type, msg, asError = TRUE) :
    SSL certificate problem, verify that the CA cert is OK. Details:
    error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed

    I managed to track down a solution on stackexchange for this issue (it is related to curl’s default options), and it does indeed resolve the issue.

    Run the code below:

    library(RCurl)
    # Set SSL certs globally
    options(RCurlOptions = list(cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl")))

  2. This vignette has a much simpler example of pulling data from twitter using httr and jsonlite (all the way at the bottom of the vignette).

    • Thanks for sharing. In my humble opinion I find the example you refer to a tad more complicated – but then again tastes differ.

  3. You’d be a lot better off building on top of httr instead of RCurl. httr:

    Does escaping correctly on all platforms
    Has already implemented OAuth 1 (and 2)
    Configures RCurl correctly to work on Windows

    • Hi Hadley, thanks for your suggestion; I will keep that in mind. and on this occasion – thanks also for bringing devtools to the community!
