Pivoting over death causes in Germany using Calc from LibreOffice 3.6

This article is going to cover the usage of pivot tables and charts with Calc LibreOffice 3.6 using statistics on death causes in Germany from 1980 to 2010 – categorizing gender and 18 age groups. I am tempted to make this a data labeled article – but as the meaningfulness of absolute numbers regarding a population whose age distribution is changing over time is very restricted I won’t (did I leave out commas here?). As a matter of fact I covered fatalities caused by viral hepatitis in a previous article in more detail – I use this cause of death also in this post.

Some clarifying articles

To understand how I obtained the table on death causes in the structure described below, I recommend reading a few preceding articles on this blog. But to just get an idea how pivoting is organized in Calc it is not required to do so – it’s not rocket science.

The table we use

The table with the statistics on death causes keeps 89’280 values in separate rows. Each row consists of four fields keeping the value for each characteristic forming the key which references the actual value we want to investigate.

  • year € {1980, …, 2010} ; #=31
  • gender € {m, f} ; #=2
  • age € {‘0’, ‘1-14′, ’15-19′, ’20-24′, …, ’80-84′, ’85-inf’, ‘?’} ; #=18
  • cause € { … ‘HIV’ … ‘pneumonia’ … ‘SIDS’ … } ; #=80
hence: 89’280 = 31 x 2 x 18 x 80
First you need the data you want to pivot in what is called a “long format”. What that means and how you get there I cover in the above listed articles. So lets assume we have the data properly structured on a spreadsheet.

Pivoting aka ‘squeezing the table’

Now we just select a cell in this table and choose ‘Data’ / ‘Pivot Table’ / ‘Create …’ from the menu accepting the default settings.

As with most technologies and tools covered in this blog I am not even going to try to explain every detail. So just play around a bit with the settings – most are evident – others subtle and might be covered in coming articles.

Now here we can decide where to place which characteristic. Which is not simple to explain with words but very easy to understand when I just give you an example.

Let’s say we want to know how many men and women (age independent) died per year of viral hepatits from 1980 until 2010. A possible layout looks like this:

Which leads to this table on a new spreadsheet:

So let’s take the value in the row for 1984 and column for female. As you know this reference is actually referring to 1440 (= 80 causes x 18 age groups) values. This is why you can choose in the layout dialogue via the button “Options …” (which is cut out from the screen shot) a type of aggregation. Usually you will be interested in summing the counts, but you can just as well have the average or variance calculated – the currently chosen type of aggregation is shown in cell A5 in above example.

But we want to restrict the causes of death to viral hepatitis, so that’s what we choose now from the “Page Fields” section.

A chart says more than a thousand cells

For turning a table into a chart you select ‘Insert’ / ‘Chart …’, choose a line type and click ‘Finish’. Now the chart won’t yet contain the data you intended it to. For that you double click on the chart area to put the application focus on it – mind how also the main menus and the tool bars are changing. Now you can choose from the chart’s context menu ‘Data Ranges …’.

Next to the input field labeled “Data range” you see an icon with a green arrow – when you click on it the focus is back on the spreadsheet and you can select the cells you want to reference instead of entering the coordinates manually. Let’s choose the columns for “f”, “m” and the year including the headers which contain the column names. Make sure both check boxes are set, so Calc automatically chooses the top row cells for the names of the chart lines and the first column as the categories on the x-axis.

Using the settings on “Data Series” you can fine tune the different charted data.

In case the sudden increase makes you curious about what is going on, check out my article on ‘Increase of deaths due to viral hepatitis in the year 1998 in Germany‘.

Calc vs. Excel in this regard

Having used only Excel 2010 so far for the purpose of pivoting I must say that I am very positively surprised! I think the only feature seriously missing is an ability to create on fields and elements. This can be compensated though. The layout dialogue in Calc is also easier to understand for beginners as it is clearer how to get to a certain table layout.