ggplot2 basics in action

ggplot2 is a very flexible plotting package for R, designed by Hadley Wickham following a concept called the “grammar of graphics”, and all in all pretty awesome. So let’s jump right in with some simple examples that should help you get going.

The basic concept

Basically, ggplot2 consists of a set of functions addressing various aspects of a plot. They are joined by ‘+’ and together form a unit describing your desired graphic. Thanks to well-chosen default settings you don’t have to describe every detail of your plot, but if necessary it can be done. You can think of your plot as the result of different layers piled on top of each other. The data has to be supplied as a single data frame that keeps the values in separate named columns (at times referred to as variables).

First of all we need “data” and also to load ggplot2:
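The data used in the original post is not shown here, so the following is only a minimal sketch of what it could look like. The column names a, b and d come from the text further down; the values, the start date and the name df are my assumptions.

```r
library(ggplot2)

# Hypothetical example data (an assumption, not the original values):
# two numeric columns a and b, plus one timestamp per day in column d.
df <- data.frame(
  a = c(1, 3, 2, 5, 4, 6, 8, 7, 9, 10),
  b = c(2, 1, 4, 3, 6, 5, 7, 9, 8, 10),
  d = seq(as.POSIXct("2014-01-01", tz = "UTC"), by = "day", length.out = 10)
)

# Quick look at the structure: every column is a named variable.
str(df)
```

Storing d as a date-time (POSIXct) is a deliberate choice in this sketch, because the later examples extract the day from it with format().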

Colored scatter plot with paths

[Figure a]

The function aes() (short for “aesthetics”) just takes the names of the columns you want to plot, and that’s it.
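As a minimal sketch of that idea, assuming a data frame df with numeric columns a and b (the values are made up for illustration), a plain scatter plot could look like this:

```r
library(ggplot2)

# Hypothetical data (an assumption): numeric columns a and b, daily timestamps in d.
df <- data.frame(
  a = c(1, 3, 2, 5, 4, 6, 8, 7, 9, 10),
  b = c(2, 1, 4, 3, 6, 5, 7, 9, 8, 10),
  d = seq(as.POSIXct("2014-01-01", tz = "UTC"), by = "day", length.out = 10)
)

# Data layer with the aesthetics, then one geometric layer for the points.
p <- ggplot(df, aes(x = a, y = b)) +
  geom_point()

print(p)
```

Note that aes() receives bare column names, not strings; ggplot2 looks them up in the data frame.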

Now let’s spice it up a bit and color the points depending on the day and increase the size of the points.
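One way this could be written, as a sketch with made-up data: the day is extracted from the date-time column d with format(), and the point size is set as a fixed value outside of aes(). The size value 5 is an arbitrary choice of mine.

```r
library(ggplot2)

# Hypothetical data (an assumption): numeric columns a and b, daily timestamps in d.
df <- data.frame(
  a = c(1, 3, 2, 5, 4, 6, 8, 7, 9, 10),
  b = c(2, 1, 4, 3, 6, 5, 7, 9, 8, 10),
  d = seq(as.POSIXct("2014-01-01", tz = "UTC"), by = "day", length.out = 10)
)

# color is mapped to the day of the month (a string like "01", "02", ...);
# size is a constant, so it goes into geom_point() and not into aes().
p <- ggplot(df, aes(x = a, y = b, color = format(d, "%d"))) +
  geom_point(size = 5)

print(p)
```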

[Figure b]

The parameter color within the aes() function does not define which colors to use for which point, but which sets of points are supposed to be colored the same. In this case we extract, for every pair of values from columns a and b, the day from the associated date. Every row has a different date, hence every point is colored differently. Let’s have a look at a different example where I use the remainder of a division by 2 to define the separate point sets.
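A sketch of the modulo variant, again with made-up data: the day is converted to an integer, and its remainder when divided by 2 splits the points into two sets. Wrapping the result in factor() is my assumption, so that ggplot2 treats it as two discrete groups instead of a continuous scale.

```r
library(ggplot2)

# Hypothetical data (an assumption): numeric columns a and b, daily timestamps in d.
df <- data.frame(
  a = c(1, 3, 2, 5, 4, 6, 8, 7, 9, 10),
  b = c(2, 1, 4, 3, 6, 5, 7, 9, 8, 10),
  d = seq(as.POSIXct("2014-01-01", tz = "UTC"), by = "day", length.out = 10)
)

# Day of month modulo 2 gives 0 for even days and 1 for odd days,
# so the points fall into exactly two color sets.
p <- ggplot(df, aes(x = a, y = b,
                    color = factor(as.integer(format(d, "%d")) %% 2))) +
  geom_point(size = 5)

print(p)
```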

[Figure c]

Now let’s add lines to the plot, connecting the points according to their order in the data frame.
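Adding the lines could be sketched like this, using geom_path(), which connects observations in the order they appear in the data frame. The data and the modulo coloring are assumptions carried over for illustration. Because color is mapped in ggplot(), it also applies to the path layer, so each color set gets its own path.

```r
library(ggplot2)

# Hypothetical data (an assumption): numeric columns a and b, daily timestamps in d.
df <- data.frame(
  a = c(1, 3, 2, 5, 4, 6, 8, 7, 9, 10),
  b = c(2, 1, 4, 3, 6, 5, 7, 9, 8, 10),
  d = seq(as.POSIXct("2014-01-01", tz = "UTC"), by = "day", length.out = 10)
)

# First geometric layer: the points. Second geometric layer: the paths.
# Both inherit the aesthetics defined in ggplot().
p <- ggplot(df, aes(x = a, y = b,
                    color = factor(as.integer(format(d, "%d")) %% 2))) +
  geom_point(size = 5) +
  geom_path()

print(p)
```

If the color were mapped to the full day instead, every point would form its own group and there would be nothing for geom_path() to connect.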

[Figure d]

Here you can observe the layer concept at work. First the data layer, then two graphic / geometric layers – first the scatter plot then the paths connecting the points.

Because the coloration is based on format(), R uses this expression as the legend title. But of course this is not pleasing to the eye, and hence we change it to something more meaningful. The legend (layer) is addressed by a separate function.
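A sketch of one way to do this: scale_color_discrete() controls the discrete color scale, and its name argument sets the legend title. The title text and the data are my own choices for illustration; the original post may have used a different function or label.

```r
library(ggplot2)

# Hypothetical data (an assumption): numeric columns a and b, daily timestamps in d.
df <- data.frame(
  a = c(1, 3, 2, 5, 4, 6, 8, 7, 9, 10),
  b = c(2, 1, 4, 3, 6, 5, 7, 9, 8, 10),
  d = seq(as.POSIXct("2014-01-01", tz = "UTC"), by = "day", length.out = 10)
)

# Same plot as before, plus a scale function that renames the legend.
p <- ggplot(df, aes(x = a, y = b,
                    color = factor(as.integer(format(d, "%d")) %% 2))) +
  geom_point(size = 5) +
  geom_path() +
  scale_color_discrete(name = "Day modulo 2")

print(p)
```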

[Figure e]

I think this looks pretty nice already.

Two charts on one plot

Let’s start with a line plot of variable / column a along the time line in d.
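As a minimal sketch, assuming d is a date-time column and a is numeric (the values are made up), the line plot could be written as:

```r
library(ggplot2)

# Hypothetical data (an assumption): numeric columns a and b, daily timestamps in d.
df <- data.frame(
  a = c(1, 3, 2, 5, 4, 6, 8, 7, 9, 10),
  b = c(2, 1, 4, 3, 6, 5, 7, 9, 8, 10),
  d = seq(as.POSIXct("2014-01-01", tz = "UTC"), by = "day", length.out = 10)
)

# Time on the x-axis, variable a on the y-axis, connected by a line.
p <- ggplot(df, aes(x = d, y = a)) +
  geom_line()

print(p)
```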

[Figure f]

Obviously the labeling of the x-axis is not what we want, because even hours and minutes are displayed. So next we specify that we only want to see the days by addressing the x-axis.
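With a current ggplot2 version this could be sketched with scale_x_datetime(), where date_labels takes a strftime-style format and date_breaks the spacing of the ticks. The original post may well have used a different (older) way of specifying the format, so take this as one possible variant with made-up data.

```r
library(ggplot2)

# Hypothetical data (an assumption): numeric columns a and b, daily timestamps in d.
df <- data.frame(
  a = c(1, 3, 2, 5, 4, 6, 8, 7, 9, 10),
  b = c(2, 1, 4, 3, 6, 5, 7, 9, 8, 10),
  d = seq(as.POSIXct("2014-01-01", tz = "UTC"), by = "day", length.out = 10)
)

# One tick per day, labeled with the day of the month only.
p <- ggplot(df, aes(x = d, y = a)) +
  geom_line() +
  scale_x_datetime(date_breaks = "1 day", date_labels = "%d")

print(p)
```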

[Figure g]

Now we take it a step further and display two charts in one plot. Because we are now addressing two column pairs, (d, a) and (d, b), we have to move the “aesthetics” from ggplot() to the respective geom_line().
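A sketch of that change, with made-up data: ggplot() only receives the data frame, and each geom_line() brings its own aes().

```r
library(ggplot2)

# Hypothetical data (an assumption): numeric columns a and b, daily timestamps in d.
df <- data.frame(
  a = c(1, 3, 2, 5, 4, 6, 8, 7, 9, 10),
  b = c(2, 1, 4, 3, 6, 5, 7, 9, 8, 10),
  d = seq(as.POSIXct("2014-01-01", tz = "UTC"), by = "day", length.out = 10)
)

# Each line layer maps its own column pair; the data frame is shared.
p <- ggplot(df) +
  geom_line(aes(x = d, y = a)) +
  geom_line(aes(x = d, y = b)) +
  scale_x_datetime(date_breaks = "1 day", date_labels = "%d")

print(p)
```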

[Figure h]

Now we have the two charts in one plot but two things about it are problematic:

  • The title of the y-axis is only referring to variable a.
  • We don’t know which line is representing which variable.
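One possible way to address both points, as a sketch: ylab() sets a neutral y-axis title, and mapping color to a constant label inside each aes() makes ggplot2 draw a legend, which scale_color_manual() then names and colors. The label texts, the color values and the data are all my assumptions.

```r
library(ggplot2)

# Hypothetical data (an assumption): numeric columns a and b, daily timestamps in d.
df <- data.frame(
  a = c(1, 3, 2, 5, 4, 6, 8, 7, 9, 10),
  b = c(2, 1, 4, 3, 6, 5, 7, 9, 8, 10),
  d = seq(as.POSIXct("2014-01-01", tz = "UTC"), by = "day", length.out = 10)
)

# color = "a" / "b" are plain labels here, not color names;
# scale_color_manual() translates them into actual colors and a legend.
p <- ggplot(df) +
  geom_line(aes(x = d, y = a, color = "a")) +
  geom_line(aes(x = d, y = b, color = "b")) +
  ylab("value") +
  scale_color_manual(name = "variable",
                     values = c(a = "steelblue", b = "firebrick"))

print(p)
```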

[Figure i]

Exchanging the underlying data frame

Let’s say we want to reuse the plot specified above, but with different data. For this purpose there is a special operator, ‘%+%’.
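A minimal sketch of how this could look, assuming the installed ggplot2 version provides the operator: the plot is stored in a variable and then combined with a second data frame that has the same column names. Both data frames and the names p, p2 and df2 are made up for illustration.

```r
library(ggplot2)

# Hypothetical data (an assumption): numeric columns a and b, daily timestamps in d.
df <- data.frame(
  a = c(1, 3, 2, 5, 4, 6, 8, 7, 9, 10),
  b = c(2, 1, 4, 3, 6, 5, 7, 9, 8, 10),
  d = seq(as.POSIXct("2014-01-01", tz = "UTC"), by = "day", length.out = 10)
)

# The plot specification, built on the first data frame.
p <- ggplot(df) +
  geom_line(aes(x = d, y = a)) +
  geom_line(aes(x = d, y = b))

# A second, also hypothetical, data frame with the same column names.
df2 <- data.frame(
  a = c(5, 4, 6, 3, 7),
  b = c(1, 2, 3, 4, 5),
  d = seq(as.POSIXct("2014-02-01", tz = "UTC"), by = "day", length.out = 5)
)

# Same plot specification, different underlying data.
p2 <- p %+% df2

print(p2)
```

The original plot p is left untouched, so both versions can be drawn side by side.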

[Figure j]

How to progress?

The good news is that ggplot2 has a big community, and it seems like almost every question has already been addressed somewhere and can be tracked down with Google. If you can’t find the solution to a problem, I recommend posting a question on stackoverflow.com. But as ggplot2 is a very powerful tool, it is a good idea to learn it more thoroughly than just by trial and error. For this purpose I recommend the book “ggplot2 – Elegant Graphics for Data Analysis”, which is written by its creator. It refers to a prior version and is hence partly outdated, but this doesn’t matter much because a lot is still the same, and it communicates the concept behind ggplot2 very well, which is what matters most.

Also definitely check out these two official sources:

– Thanks, Raffael