There aren’t many cases where I would produce an Excel file with PHP – as almost always there is a better way to get your data where you need it. If I needed PHP processed data in an Excel file I would clearly recommend to feed the data into a database and fetch it using PowerPivot. But still I had and have occasional situations where taking this route is the most comfortable and / or efficient. Luckily it is actually pretty simple when you use PHPExcel. Considering that PHP is a bit odd and Excel is also a bit odd and open source software tends to be very individual (odd) it is no surprise that PHPExcel is also a tad odd – but after you got used it, you have a great tool at hand for handling Excel files with PHP.
Exempli gratia
Let’s assume we have table on sheet ‘data’ with three columns ‘date’, ‘x’ and ‘y’. ‘date’ keeps the days of June 2012 (sorry for the wrong formatting in the screen shot, will fix it), ‘x’ the numbers from 1 to 30 and ‘y’ the square root (‘Wurzel’ in german) of the adjacent cell in the ‘x’ column.
The script then reads the Excel file, copies the table to a new sheet ‘new data’, sums the values of ‘x’ and ‘y’ in the column ‘x+y’ and saves the new version into a new Excel file.
So in the end the script will cover most features of PHPExcel that you are ever going to need to get some serious work done. Here we go:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 |
<?php require_once "./phpexcel/Classes/PHPExcel.php"; // factory for the appropriate Excel file reader $reader = PHPExcel_IOFactory::createReader("Excel2007"); // loads the whole workbook into a PHP object $excelWorkbook = $reader->load("test.xlsx"); // makes the sheet 'data' available as an object $sheet = $excelWorkbook->getSheetByName("data"); // the row keeping the table headers in my example $headRow = 4; $data = array(); for($i = 1; $i <= 30; $i++){ $row = $i + $headRow; // converts the numeric date representation used in Excel // into a DateTime object $date = PHPExcel_Shared_Date::ExcelToPHPObject( // returns the value of cell "Bx" $sheet->getCell("B{$row}")->getValue() )->format("Y-m-d"); $data[$date] = array( "x" => $sheet->getCell("C{$row}")->getValue(), // cells in column D keep a formula referencing cells in // column C. So if I would use getValue(), then I'd receive // the formula, but I want the value of the evaluated formula. "y" => $sheet->getCell("D{$row}")->getFormattedValue() ); } // lets create a new sheet 'new data' and add it to the workbook $newSheet = new PHPExcel_Worksheet($excelWorkbook, "new data"); $excelWorkbook->addSheet($newSheet); // lets place the two column headers in the first row $newSheet->setCellValue("A1", "date"); $newSheet->setCellValue("B1", "x+y"); $i = 0; foreach ($data as $date => $values) { // first row has index 1 $row = 1 + (++$i); // transform the date key into a DateTime object $date = DateTime::createFromFormat("Y-m-d", $date); // just writing the date to a cell would lead to a string // value in the Excel cell. We want a date value. So we use the // following static method for transforming the date value // into an object Excel knows how to deal with. $excelDate = PHPExcel_Shared_Date::PHPToExcel($date->getTimestamp()); // when writing to cells iteratively it is more convenient to // reference the cells with numeric coordinates $newSheet->setCellValueByColumnAndRow(0, $row, $excelDate); // here we tell Excel how to format / eventually display the date $newSheet->getStyleByColumnAndRow(0, $row)->getNumberFormat()->setFormatCode( PHPExcel_Style_NumberFormat::FORMAT_DATE_DDMMYYYY ); // just summing the x and y column $newSheet->setCellValueByColumnAndRow(1, $row, array_sum($values)); } // save the Excel object to an Excel 2007 type Excel workbook $xlsx = new PHPExcel_Writer_Excel2007($excelWorkbook); $xlsx->save("new_test.xlsx"); |
Excel file types
Office files contained before version 2007 binary data. With version 2007 Microsoft switched to zip compressed XML documents. That’s the reason for why there are reader and writer classes for Excel 2007 and Excel 5.
Issue with cell color formatting
When using PHPExcel to produce the character maps for my first data article I found that I just couldn’t apply conditional formatting of the cell color background. I describe the bug symptoms in this Stackoverflow question – but anyway the solution to this problem is to select all cells and choose ‘none’ as background color. After doing so the coloring issues are resolved.
Table references of [col] type not handled propperly
In the above example I first had the cell area formatted as an actual table. When you are in such a table referencing a cell of the same row then a square bracket notation is used with the column’s name between the brackets. PHPExcel then gave me an error for the line where I read the evaluated formula’s value for a cell. When you replace the bracket type referencing with the usual A1 notation then the evaluated value is readable again.
PHPExcel can’t handle pivot tables
I once tried to produce ready to read reports with PHPExcel where you have the charts and pivot tables on one sheet referencing the actual data on another. My idea was to construct the dashboard part as usual and have PHPExcel just update the source data on the separate sheet. Sadly this didn’t work and I still can’t make it work. After loading an Excel file with pivot tables and writing it into a file again, the pivot table is gone – even if it resides on a sheet you didn’t touch with PHPExcel.
Memory issues
For the above described scenario I eventually came up with an indirect solution where I have PHPExcel create a source file, while the dashboard with the pivot tables resides in a separate file importing the data from the source. That works all nice and dandy but for really big data sets I always faced a memory limit when handling such files. If you create a file with 100’000 records consisting of 10 fields (which sounds much but isn’t really), then PHPExcel has to represent all those 1’000’000 cells in memory. According to what I read, PHPExcel reserves for one cell about 1KB … so this (not really) big table consumes already about 1 GB of your memory. In such a case always resort to PowerPivot if you have to use Excel.
Another example for using PHPExcel ist the script that creates the frequency map for the character combinations.
Thanks