Managers like Excel and that’s why you will have to use this application when dealing with or being responsible for data and figures in your company. And actually I think Excel is not that bad at all – as long as you know what to use it for and when to resort to other tools. What I like about Excel the most is for one the pivoting tool which is very convenient and efficient to use when you have the data structured appropriately. Another thing is that Excel provides a very convenient and powerful bridge between the data and its representation in dashboards and reports ready for the C[here comes a capitalized letter]O. This bridge is mainly built of different ways to reference and process individual cells and groups of cells and aggregate and arrange their values in a way appealing to the eye – or worse turn it into a PowerPoint slide.
Natural limitations of Excel
So far so good! But where does the data on the dark side of the Excel file come from? Sometimes a pitiful person has to fill the cells manually – sometimes cells in remote Excel files are referenced or a VBA script is fetching the data from a web site or a database. This is all nice and dandy as long as we are talking about few hundred cells in the first case or less than a million rows (practically way less actually) in the latter. The case where a large amount of cells from remote Excel files is referenced is anyway less about technological knowledge then it is about a strong believe in god and sending a prayer towards the sky every time you refresh the data.
And finally Microsoft’s answer to it – PowerPivot
Basically processing large amounts of data was virtually impossible before Microsoft blessed the universe with PowerPivot (now available in version 11). Thanks to PowerPivot you can fatten an Excel file until it reaches the natural limit dictated by your file system (at least in theory). Also the data is handled much more efficiently in my experience and is compressed. What I would recommend though is to stack up the RAM because pivoting over one GB of data is no fun when half of it is swapped on your disk.
The best feature of PowerPivot though is that you can now query the DBMS of your choice using SQL without having to resort to VBA – and anyway, without this extension the amount of processible data would be still severly restricted. Together with PowerPivot a new formula language called DAX (Data Analysis Expressions) is introduced which allows for more sophisticated and fine grained aggregations in pivoting tables.
No API for VBA available
One annoying handicap though is that PowerPivot is not programmble through VBA. So for refreshing a dashboard you always have to go launch the PowerPivot extension, refresh/fetch the data there and then go back to Excel to refresh the spreadsheet. For me as an IT guy this is no reason to worry but it makes it uneasy for a usual manager who wants to update his dashboard. The reason for this is that PowerPivot is not primarily intended to be used as a simple extension for Excel but rather the connecting link between Excel, SQL Server and Sharepoint. But Microsoft seems to have realized that most users are interested in using it independant of Sharepoint. So there are some rumors that this might be solved in future versions – as a matter of fact PowerPivot is integrated directly into Excel 2013.
Using PowerPivot in the real world
PowerPivot is certainly no tool of unlimited joy and fun to work with – but if you let yourself into it and learn how to avoid or work around its quirks then you can get quite decent processing of large amounts of data done with it. Just to give you an idea; the biggest set of data I had to deal with so far was a table of 1.5 million records each composed of about twenty fields plus some more tables keeping up to 100’000 records. All that stuff presented in an interactive pivoting-driven dashboard, updated on a daily basis from a MySQL database – no big deal! To make a long story short – a limit is not yet in sight.
One of the next articles will be about how to set up PowerPivot in Excel 2010 and an ODBC driver for MySQL on Windows 7 64 bit – which is necessary to make PowerPivot talk to MySQL.
(If PowerPivot is not an option for you Excel reporting then this article might be of interest to you.)