Yesterday I was willing to invest a couple of hours to install Excel 2013 Home Premium and give PowerPivot a try using the largest dataset I have available locally. A data set from Eurostat keeping passenger and flight volume data going between pretty much all European airports and a partner country on a monthly basis. Extracted from SDMX and loaded into a Postgres database – the table keeps more than 8 million records already provided in long format. Okay – well, it’s not super huge but large enough to get a feeling about performance and usability of a tool. So I was looking forward to PowerPivot-driven data mining and in for a very bad surprise!
Sometimes it happens that PowerPivot crashes and is not loaded again automatically when starting Excel. Here is how you get PowerPivot back on track.
- Open the “File” section from the ribbon
- Choose “Options”
- Choose “Add-Ins”
- At the bottom on the right hand side you’ll find a select box labelled “Manage”
- Choose “Disabled Items”
- Click “Go …”
- Choose the item referring to PowerPivot
- Click “Enable”
- Restart Excel
That’s it. In my experience no handicaps are to be expected. The crashes seem pretty random to me. Like a sudden nervous breakdown – just that Excel doesn’t need time-off to recover.
KPIs for a grocery store
I think this topic is introduced best using an example. Let’s assume an offline store and we are responsible for the data analysis. The boss wants us to count the people visiting the shop considering their age group and also how many of those visits lead to a purchase.
PowerPivot itself cannot be automated with VBA. But with a few tricks we can control how the data drawn from PowerPivot is accessed within Excel. This article assumes that you know how to import data with PowerPivot and how to integrate it into an Excel file as a pivot table. Also you should have activated the developer tools in Excel. There you find all the VBA relevant stuff like the macro recorder and the dialogue controls you want to add.
Let’s get it started with an example
We start with a PowerPivot table (named “data”) keeping the values for two variables A and B for every day in 2012 imported from a MySQL-DB in this case. So in the PowerPivot window this would look something like that:
So let’s assume some manager is asking for a dashboard displaying those two values in a chart but only one at a time and s/he is not inclined to deal with pivot table field lists but wants to switch between A and B using buttons.
“The operation has been cancelled because there is not enough memory available for the application.”
Recently I experienced occasional errors regarding lack of memory when trying to import big data sets with PowerPivot. I am using Excel 32bit so the maxium theoretical in-memory size of an Excel file is 4GB. But judging from the memory usage displayed in the task manager it quickly became clear that I wasn’t even getting close to this boundary.
Pivot tables are in contrast to popular belief not just an Excel feature but present in a lot of applications dealing with tabularly structured numerical data – pivot tables are the visual and interactive result of a general concept of data aggregation depending on categories. LibreOffice (check out the article on it), OpenOffice and R by libraries like reshape offer this technology targeting at a different audience. If you understand pivoting in one of these tools, you pretty much got it for all of them.
At my work I am offering a workshop on pivoting with Excel 2010 and so I came up with a chart boiling down the concept of using pivot tables so it fits on a sheet of PDF.
When you are responsible for setting up or extending an automized reporting of data stored in a warehouse with Excel, I would always vote for using PowerPivot for fetching the data and displaying it in pivot charts and tables. But there are use cases and situations where PowerPivot is not an option. Maybe you just cannot use PowerPivot for whatever reason. But first and foremost I met / suffered from the case where you already have an Excel reporting using cell references to other files and macros to fill the cells. If this sort of reporting reaches a certain extent and importance you cannot just set it up new again. Now if you need to extend such a report f.x. with a new column and don’t want to use cell references to other files – or worse VBA – because this is very fragile and error prone then I have a solution for you.
And this is how it works
Setting up PowerPivot – that means all the way from downloading it and successfully fetching data – keeps a number of pitfalls in readiness. To spare you the sweat I will give you some guidance on this topic. I assume that you are using Excel 2010 on Windows 7 and that you want to connect to MySQL 5.
There aren’t many cases where I would produce an Excel file with PHP – as almost always there is a better way to get your data where you need it. If I needed PHP processed data in an Excel file I would clearly recommend to feed the data into a database and fetch it using PowerPivot. But still I had and have occasional situations where taking this route is the most comfortable and / or efficient. Luckily it is actually pretty simple when you use PHPExcel. Considering that PHP is a bit odd and Excel is also a bit odd and open source software tends to be very individual (odd) it is no surprise that PHPExcel is also a tad odd – but after you got used it, you have a great tool at hand for handling Excel files with PHP.
Managers like Excel and that’s why you will have to use this application when dealing with or being responsible for data and figures in your company. And actually I think Excel is not that bad at all – as long as you know what to use it for and when to resort to other tools. What I like about Excel the most is for one the pivoting tool which is very convenient and efficient to use when you have the data structured appropriately. Another thing is that Excel provides a very convenient and powerful bridge between the data and its representation in dashboards and reports ready for the C[here comes a capitalized letter]O. This bridge is mainly built of different ways to reference and process individual cells and groups of cells and aggregate and arrange their values in a way appealing to the eye – or worse turn it into a PowerPoint slide.