Transforming an XML Document into a CSV using XMLStarlet

In this little tutorial I am going to describe a handy tool for transforming an XML document into a more easily processable CSV format. There are many ways of getting this job done, but most are more tedious than necessary (like writing a custom-made RegEx parser, yuck!). Using XMLStarlet and XPath expressions this is going to be a cinch. Let’s evaluate a number of typical XML data configurations and turn them into a flat CSV structure.

<key> value </key>

Solution:

XMLStarlet can do a lot of stuff. We want to use it for querying / selecting from an XML document, which is what the sel subcommand does. -T or --text tells xmlstarlet to output text instead of XML. -t -m /root/record or --template --match /root/record specifies the section (or template) of the XML document which we would like to match repeatedly, which is every <record> section below <root>. -v or --value-of followed by an XPath expression specifies the string which we would like to output for each match. To get one line per record we add -n or --nl for a newline. test.xml is … correct!
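As a minimal sketch of such a call: the snippet below assumes a hypothetical test.xml in which every <record> carries an id attribute and two child tags A and B (names and values are made up for illustration), and it joins the fields with XPath's concat() function.

```shell
# Hypothetical sample input: two records, each with an id attribute
# and two plain child tags A and B.
cat > test.xml <<'EOF'
<root>
  <record id="1">
    <A>val_1A</A>
    <B>val_1B</B>
  </record>
  <record id="2">
    <A>val_2A</A>
    <B>val_2B</B>
  </record>
</root>
EOF

# For every /root/record, print id, A and B joined by semicolons,
# followed by a newline (-n), so each record becomes one CSV line.
result=$(xmlstarlet sel -T -t -m /root/record -v "concat(@id,';',A,';',B)" -n test.xml)
printf '%s\n' "$result"
# 1;val_1A;val_1B
# 2;val_2A;val_2B
```

Redirecting the output of the xmlstarlet call into a file (for example with > test.csv) would store the result as a semicolon-separated CSV.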

The XPath expressions are concatenated and separated with semicolons. I guess there is not much more to add really, as XPath is best understood by staring at it. In case you have to write a custom XPath query, this site featuring a whole lot of examples for XPath expressions is pretty helpful.

<tag name="key"> value </tag>

Solution:

key[@name='C'] translates to “Get the value of the tag named ‘key’ if it has an attribute named ‘name’ with value ‘C’”.
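A small sketch of this case, assuming a hypothetical test.xml where each <record> holds several <key> tags that differ only in their name attribute (the second key, named Z, is a made-up decoy to show that the predicate filters it out):

```shell
# Hypothetical sample input: the value sits inside <key name="C">,
# next to another <key> that should be ignored.
cat > test.xml <<'EOF'
<root>
  <record id="1">
    <key name="Z">ignored_1</key>
    <key name="C">val_1C</key>
  </record>
  <record id="2">
    <key name="Z">ignored_2</key>
    <key name="C">val_2C</key>
  </record>
</root>
EOF

# key[@name='C'] keeps only the <key> whose name attribute equals C.
result=$(xmlstarlet sel -T -t -m /root/record -v "concat(@id,';',key[@name='C'])" -n test.xml)
printf '%s\n' "$result"
# 1;val_1C
# 2;val_2C
```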

<tag name="key" val="value" />

Solution:

key[@name='D']/@value translates to “Get the value of the attribute named ‘value’ of the tag named ‘key’ if that tag has an attribute ‘name’ with value ‘D’”.
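Sketched as a command, assuming a hypothetical test.xml in which the data lives in a value attribute of an empty <key> tag (sample names and values are invented for illustration):

```shell
# Hypothetical sample input: the value is stored in the "value" attribute
# of the <key> tag whose name attribute is D.
cat > test.xml <<'EOF'
<root>
  <record id="1">
    <key name="D" value="val_1D" />
  </record>
  <record id="2">
    <key name="D" value="val_2D" />
  </record>
</root>
EOF

# Select the matching <key> first, then step to its value attribute with /@value.
result=$(xmlstarlet sel -T -t -m /root/record -v "concat(@id,';',key[@name='D']/@value)" -n test.xml)
printf '%s\n' "$result"
# 1;val_1D
# 2;val_2D
```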

<item> <key> K </key> <value> V </value> </item>

Solution:

item[key='E']/value translates to “Get the value of the tag ‘value’ below ‘item’ if this ‘item’ has a child tag named ‘key’ with value ‘E’”.
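A sketch for this layout, assuming a hypothetical test.xml where each <record> contains <item> blocks with a <key> and a <value> child (the item with key F is a made-up decoy that the predicate skips):

```shell
# Hypothetical sample input: key and value are sibling tags inside <item>.
cat > test.xml <<'EOF'
<root>
  <record id="1">
    <item><key>F</key><value>ignored_1</value></item>
    <item><key>E</key><value>val_1E</value></item>
  </record>
  <record id="2">
    <item><key>F</key><value>ignored_2</value></item>
    <item><key>E</key><value>val_2E</value></item>
  </record>
</root>
EOF

# item[key='E'] keeps the <item> whose child <key> has the text E,
# /value then steps to that item's <value> child.
result=$(xmlstarlet sel -T -t -m /root/record -v "concat(@id,';',item[key='E']/value)" -n test.xml)
printf '%s\n' "$result"
# 1;val_1E
# 2;val_2E
```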

<object> <K1> V1 </K1> <K2> V2 </K2> </object>

Solution:
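One way this case could be handled, sketched under the assumption of a hypothetical test.xml where each <record> wraps its fields in an <object> tag and the tag names K1 and K2 act as keys (all names and values are invented for illustration): a plain path step into the wrapper is enough.

```shell
# Hypothetical sample input: the fields are nested one level deeper,
# inside an <object> wrapper.
cat > test.xml <<'EOF'
<root>
  <record id="1">
    <object>
      <K1>val_1K1</K1>
      <K2>val_1K2</K2>
    </object>
  </record>
  <record id="2">
    <object>
      <K1>val_2K1</K1>
      <K2>val_2K2</K2>
    </object>
  </record>
</root>
EOF

# object/K1 and object/K2 walk from the matched <record> into <object>
# and pick the child tags by name.
result=$(xmlstarlet sel -T -t -m /root/record -v "concat(@id,';',object/K1,';',object/K2)" -n test.xml)
printf '%s\n' "$result"
# 1;val_1K1;val_1K2
# 2;val_2K1;val_2K2
```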

Enjoy converting XML to CSV :)


(original article published on www.joyofdata.de)
