Eurostat Basics in Action (Unknown Causes of Death)

Eurostat is the institution within the European Union that organizes statistics from the 27 EU member states (for example from the German Federal Office of Statistics, which also maintains web access to its data). Its website offers a wealth of statistics, reports, documents and visualization tools. It is pretty huge and I still get lost on it easily or discover new things. So this article doesn’t even try to show you around. I’ll just illustrate one aspect of the site, the statistics database, in the context of a concrete question. In case you like population statistics thrown on maps you might be interested in the following articles which use data from Eurostat:

The question we’ll investigate

How often did people, split into those younger than 65 and those aged 65 or older, die in the recent past from a cause categorized as “Ill-defined and unknown causes of mortality”? We will be looking at the national level (NUTS 0).

Okay, now here is how we answer that question

1) On the Eurostat homepage you click the link labeled “Statistics Database”.

2) What you see now is a tree structure looking like something that was fancy 10 years ago. It consists of five main sections of which I almost exclusively use the first one – “Database by themes”. This section gathers as far as I know all the available data in its most versatile organisation.

Statistics database on Eurostat

3) The path we take is shown in the screenshot. What you can also see is a cryptic character sequence in brackets – “hlth_cd_ysdr1” – that’s the name of the table we are interested in. It means something like “health – causes of death – yearly standardized death rates” (I guess). So to make a long story short we type that into the input field above the tree and just search for it. We follow the little yellow arrows and click it.

4) A new window pops open displaying the table according to the default settings – a different death cause – regions down to NUTS 2. Not what we want. So we have to apply some adjustments. For that purpose we click on the tab labelled “Select Data”.

5) Within the different categories we select the attributes we care about (and after we are done with one category we click the link labelled “Update”):

      • AGE: select all three attributes
      • GEO:
        • “Show all”
        • “select all” then “unselect all”
        • choose “Nuts level” and “NUTS 0”
        • “Search” then “select all”
      • ICD10:
        • “select all” then “unselect all”
        • choose the one with code “R96-R99”
      • SEX: “Total”
      • TIME: “2008_2010”
      • UNIT: “Standardised death rate”
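The selection in step 5 boils down to filtering on one dimension at a time. Here is a minimal Python sketch of that idea, assuming the table has been exported as one row per combination of dimensions. The age codes (TOTAL, Y_LT65, Y_GE65), the sex code T, the country codes XX and YY and all values are made-up placeholders for illustration, not Eurostat figures.

```python
# Rows as they might look after exporting the table.
# All codes and values are placeholders, not real Eurostat data.
rows = [
    {"geo": "XX", "age": "Y_LT65", "icd10": "R96-R99", "sex": "T", "time": "2008_2010", "value": 1.0},
    {"geo": "XX", "age": "Y_GE65", "icd10": "R96-R99", "sex": "T", "time": "2008_2010", "value": 2.0},
    {"geo": "XX", "age": "Y_GE65", "icd10": "C", "sex": "T", "time": "2008_2010", "value": 3.0},
    {"geo": "YY", "age": "TOTAL", "icd10": "R96-R99", "sex": "M", "time": "2008_2010", "value": 4.0},
    {"geo": "YY", "age": "Y_GE65", "icd10": "R96-R99", "sex": "T", "time": "2005_2007", "value": 5.0},
    {"geo": "YY", "age": "Y_GE65", "icd10": "R96-R99", "sex": "T", "time": "2008_2010", "value": 6.0},
]

# One set of allowed codes per dimension, mirroring the "Select Data" tab.
selection = {
    "age": {"TOTAL", "Y_LT65", "Y_GE65"},
    "icd10": {"R96-R99"},
    "sex": {"T"},
    "time": {"2008_2010"},
}


def select(rows, selection):
    """Keep only rows whose code is allowed in every selected dimension."""
    return [
        row for row in rows
        if all(row[dim] in allowed for dim, allowed in selection.items())
    ]


selected = select(rows, selection)
for row in selected:
    print(row["geo"], row["age"], row["value"])
```

A dimension that is left out of the selection (here GEO) is simply not restricted, which corresponds to “select all” in the web interface.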

6) We’re done with settings so we go back to the table via “View Table”.

7) Now we have the table showing the countries row-wise and the time in the columns. But we want the age groups in the columns instead. So what we do is drag the light blue cross next to “AGE” onto the “TIME” label in the table. That’s it.
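Dragging “AGE” onto the column header is a pivot: records in long form are rearranged so that each country becomes a row and each age group a column. A rough Python sketch of the same operation follows; the country codes XX and YY, the age codes and the rates are invented placeholders, not values from the Eurostat table.

```python
from collections import defaultdict

# (country, age group, rate) triples in long form; placeholder values only.
records = [
    ("XX", "TOTAL", 3.0),
    ("XX", "Y_LT65", 1.0),
    ("XX", "Y_GE65", 2.0),
    ("YY", "TOTAL", 9.0),
    ("YY", "Y_LT65", 4.0),
    ("YY", "Y_GE65", 6.0),
]


def pivot(records):
    """Turn long (geo, age, value) records into {geo: {age: value}}."""
    table = defaultdict(dict)
    for geo, age, value in records:
        table[geo][age] = value
    return dict(table)


table = pivot(records)

# Print countries as rows and age groups as columns.
columns = ["TOTAL", "Y_LT65", "Y_GE65"]
print("GEO".ljust(5) + "".join(c.rjust(8) for c in columns))
for geo in sorted(table):
    print(geo.ljust(5) + "".join(f"{table[geo][c]:8.1f}" for c in columns))
```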

Unknown death causes / Eurostat

Why this statistic?

Now you might be wondering why I chose this data set. Just like the rate of stillbirths, this death rate can be considered an indicator of the quality of the medical infrastructure of a country or region. I mean think about it – what does it say about the medical system in, for example, Greece* when more than three times as many old people die of unknown (uncared?) causes compared to Switzerland? If doctors don’t know what somebody died of then this might be because they didn’t have the time to examine the patient’s health when she or he was still alive in the first place.

Then again some observations indicate that the interpretation is not that simple – Norway, which ranks pretty high, is known for a very high standard of living, while Romania rather isn’t.

*: Make sure you read Isabel’s very insightful comment on how to use Metadata provided by Eurostat – specifically with respect to Greece!

(original article published on

6 thoughts on “Eurostat Basics in Action (Unknown Causes of Death)”

  1. Hello! And Happy New Year!

    I came to your website via R-bloggers! ;)

    I’m not quite sure what you are saying in regards to Romania in this post, but I think it is something like one would expect more unknown causes of death in a country like ours, is that correct?

    Well, I think I might have the explanation(s). First of all, autopsies are mandatory for all deaths of unknown cause. Yes, MANDATORY. Most very sick people die in hospital, in which case the autopsy is almost always carried out – it is a very big hassle for the family to try and refuse an autopsy, and such a pass is only granted if the cause of death is extremely self-evident.

    Deaths at home have to be very well monitored by the general practitioner who is responsible for a certain village/area of a city. In such cases, the death certificate is issued only for patients that the GP knows very well and knows to have an underlying terminal illness. In the very rare cases other than this, the GP alerts the authorities and an autopsy is carried out.

    All this might make it sound like we have a pretty good medical system over here. Not really… Take a look at the number of deaths from preventable causes or the mean life expectancy. That is where our light truly “shines”. So the low number of unknown causes of death might be due to the fact that diseases are diagnosed but very poorly managed.

    What a great start for the new year! :D

    Nice site you have here, keep up the good work!



    • Salut Alex,

      thanks for the kind words and a happy new year to you, too!

      Very interesting information – thanks for sharing! Yes, indeed I would expect the unknown causes of death to be roughly inversely proportional to the GDP per head. Then again, I think reality is more complex, as often a lot of potential causes could be attributed to a death. For example, deaths related to alcohol or drug abuse probably offer plenty of damaged organs as candidates deserving the credit.

      Is there a reason why tracking down the cause of death is handled so strictly in Romania? Given that it costs money and ties up resources.



      • Hello :)
        I don’t know the exact reason why they go to the trouble of carrying out the autopsies. I suspect that it is because of the former communist regime. It might seem like a long time ago: 25 years since the dictators were executed. But then again, I’m a bit younger than that, so you can get some perspective on how people of different ages measure time. My point is, even though it might seem like a distant memory, medical staff who were beginning their careers in 1989 are still active now and still have at least a decade to go until they retire. During communist times there was a crazy effort to make RO have at least 25 million people. So every cause of death was investigated extremely thoroughly and there was (and still is) some sort of primal fear in every doctor’s mind (at least the older ones) of not knowing what happened to the patient. It is simply a matter of the system having been set up in such a way in the past, and nobody has gotten around to changing it yet. Maybe it is not a bad thing to know the cause of death, but it would certainly be better if we could provide good healthcare outside the large cities as well.

        Mind you, carrying out autopsies doesn’t necessarily mean that they are carried out with great detail and attention. Especially since there is no longer the threat of the Securitate (the former secret police in the commie years) throwing doctors in jail. I have a sneaking suspicion that some autopsies fabricate causes of death such as “heart failure”, “stroke”, “myocardial infarction” or “respiratory arrest”, just to get it over with, but this is not what happens in the majority of cases.

        You might be wondering why the crazy amount of knowledge about autopsies. Rest assured, I am no psychopath on the loose, I’m a final year med student.

        But the extent to which this might happen (fabrication of results) may vary. If you ever find yourself with free time on your hands I would suggest looking at statistical reports from somewhere you trust, like Germany for example (or somewhere else, but whenever Romanians think of trust, efficiency and high levels of skill, Germany springs to mind. We even elected a president using this positive stereotype :D).
        Now, you could compare the situation in East Germany with the Romanian statistics. In consecutive years, the proportions of causes of death should stay about the same, especially when it comes to the top 3-4-5 causes: something like 1. heart disease, 2. cancer, etc., 4. or 5. accidental death, and so on and so forth.
        Now, if the relative share of the top 3-4 causes of death fluctuates between years, that is an indication of poor gathering and reporting of results because of laziness, fabrication of results, and so on. When one year the top cause of death is heart disease and the next it becomes stroke and heart disease falls to third place, you know that that is impossible and most likely a mistake (or the result of a series of mistakes).

        You are right, reality is more complex than any model one might fit to data. And as I have explained, there is no model that can take into account nightmares of communism, laziness or other such ethereal entities in a meaningful way.

        But here is one more quick caveat about interpreting health/wellbeing data. I have recently read “Prosperity without Growth” (it is quite interesting, I recommend it) and I came across some interesting graphs. Once income per capita exceeds roughly 18,000 $ at current purchasing power parity, life expectancy, preventable causes of death and so forth stop improving. No matter if the income per capita is 30,000 $ or 12,000 $, the curve becomes flat. Needless to say, there is an almost vertical increase until about 8,000 $ PPP per capita, then the curve slows and flattens above 18,000-20,000 $. You can visualize this and all sorts of interesting stats at the Gapminder website.

        Have a nice day! :)

        • Alex, you made my day! And good point you made about wealth per head not being simply linearly related to wellbeing. Totally second that! (Great visualization by the way – offering a lot of food for thought.)

          Personally I hate this notion of an economy’s prosperity being fundamentally dependent on growth – I think this is the core economic malfunction of the 21st century. It’s going to ruin us if we don’t overcome it.

          Maybe you can have a look at a rather old post of mine, if you find the time:

          Increase of Deaths Due to Viral Hepatitis in Germany 1998

          Would love to read your opinion on that!

          Cheers, man

  2. The joy of metadata

    Hello, while I was reading through this post I was thinking that maybe the reason why so many old Greeks die of unreported causes is that they die so old that their doctor would think that old age is enough of an explanation.

    But then I thought: “Wait! Eurostat is a serious statistical office. They are very strict about ‘no data without metadata’, so maybe they have also published an explanation somewhere.” The screenshot with the dataset navigation tree shows a little ‘M-document’ symbol next to the parent branch for this data set, Causes of death (hlth_cdeath). That is the link to the ESMS metadata file* that applies to this branch and all its child nodes.

    In this case the link is:

    Under 16. Comparability we learn that:

    The comparability of the data across different countries is limited by the fact that the revision of classification used to collect information on underlying causes of death may be different. However, only one country (Greece) is currently still using the ninth revision of the ICD. Furthermore, not all countries apply the recommended WHO's updates.
    The coverage of residents dying abroad or non-residents dying in the reporting country can also affect the comparability among countries.

    So maybe there is a conversion issue that maps several Greek codes to a default “unknown” cause of death. We are also warned of other factors that make it difficult to compare data among countries.

    Now, the ESMS files might seem a bit too technical or specialised, something official statisticians write to impress other official statisticians with how thorough and knowledgeable they are. This is why Eurostat also maintains an official website, a wiki, presenting all statistical topics in an easily understandable way: Statistics Explained (SE,

    There are two relevant articles in SE about causes of death that might apply to the example dataset:

    Causes of death statistics – people over 65:

    Health statistics at regional level, under the heading Causes of death:

    After reading all that we will certainly know much more about why the statistics are collected and how they are compiled and disseminated but we might still have some questions. Questions, comments and suggestions can be sent to the European Statistical Data Support via the user form:

    Sorry for the long post, but I thought it would complete the excellent original info.


    * ESMS metadata files are used for describing the statistics released by Eurostat. They are based on the Euro SDMX Metadata Structure (ESMS), which aims at documenting methodologies, quality and the statistical production processes in general (see for a full explanation of ESMS files).

    • Hello Isabel,

      thank you so much for taking the time to write this exceptionally insightful comment on how to make use of the metadata provided by Eurostat! This might be my favorite comment so far!

      Merry Christmas and a happy new year!

