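Plenty has been written about PHPExcel's memory usage on the PHPExcel forum, so reading through some of those earlier discussions might give you a few ideas. PHPExcel holds an "in memory" representation of a spreadsheet, and is susceptible to PHP memory limitations.
The physical size of the file is largely irrelevant... it's much more important to know how many cells (rows*columns on each worksheet) it contains.
The "rule of thumb" that I've always used is an average of about 1k/cell, so a 5M cell workbook is going to require 5GB of memory. However, there are a number of ways that you can reduce that requirement. These can be combined, depending on exactly what information you need to access within your workbook, and what you want to do with it.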
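To make the rule of thumb concrete, here is a rough sketch of the arithmetic. The helper name estimateWorkbookMemory() and its 1024 bytes-per-cell default are illustrative assumptions, not part of PHPExcel; real usage varies with cell content and styling.

```php
<?php
// Rough estimate only: bytes = number of cells * average bytes per cell.
// estimateWorkbookMemory() is a hypothetical helper, not a PHPExcel function.
function estimateWorkbookMemory($rows, $columns, $sheets = 1, $bytesPerCell = 1024)
{
    // Cell count is rows * columns on each worksheet
    $cells = $rows * $columns * $sheets;
    return $cells * $bytesPerCell;
}

// 1,000 rows x 5,000 columns = 5M cells on a single worksheet
$bytes = estimateWorkbookMemory(1000, 5000);
echo round($bytes / pow(1024, 3), 1), " GiB\n";
```

For 5M cells this comes to 5,120,000,000 bytes (about 4.8 GiB), in line with the 5GB figure quoted above. Passing a smaller $bytesPerCell lets you compare the effect of the memory-saving options.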
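If you have multiple worksheets, but don't need to load all of them, then you can limit the worksheets that the Reader will load using the setLoadSheetsOnly() method. To load a single named worksheet: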
$inputFileType = 'Excel5';
$inputFileName = './sampleData/example1.xls';
$sheetname = 'Data Sheet #2';
/** Create a new Reader of the type defined in $inputFileType **/
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
/** Advise the Reader of which WorkSheets we want to load **/
$objReader->setLoadSheetsOnly($sheetname);
/** Load $inputFileName to a PHPExcel Object **/
$objPHPExcel = $objReader->load($inputFileName);
Or you can specify several worksheets with one call to setLoadSheetsOnly() by passing an array of names:
$inputFileType = 'Excel5';
$inputFileName = './sampleData/example1.xls';
$sheetnames = array('Data Sheet #1','Data Sheet #3');
/** Create a new Reader of the type defined in $inputFileType **/
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
/** Advise the Reader of which WorkSheets we want to load **/
$objReader->setLoadSheetsOnly($sheetnames);
/** Load $inputFileName to a PHPExcel Object **/
$objPHPExcel = $objReader->load($inputFileName);
If you only need to access part of a worksheet, then you can define a Read Filter to identify just which cells you actually want to load:
$inputFileType = 'Excel5';
$inputFileName = './sampleData/example1.xls';
$sheetname = 'Data Sheet #3';
/** Define a Read Filter class implementing PHPExcel_Reader_IReadFilter */
class MyReadFilter implements PHPExcel_Reader_IReadFilter {
    public function readCell($column, $row, $worksheetName = '') {
        // Read rows 1 to 7 and columns A to E only
        if ($row >= 1 && $row <= 7) {
            if (in_array($column,range('A','E'))) {
                return true;
            }
        }
        return false;
    }
}
/** Create an Instance of our Read Filter **/
$filterSubset = new MyReadFilter();
/** Create a new Reader of the type defined in $inputFileType **/
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
/** Advise the Reader of which WorkSheets we want to load
It's more efficient to limit sheet loading in this manner rather than coding it into a Read Filter **/
$objReader->setLoadSheetsOnly($sheetname);
echo 'Loading Sheet using filter';
/** Tell the Reader that we want to use the Read Filter that we've Instantiated **/
$objReader->setReadFilter($filterSubset);
/** Load only the rows and columns that match our filter from $inputFileName to a PHPExcel Object **/
$objPHPExcel = $objReader->load($inputFileName);
Using read filters, you can also read a workbook in "chunks", so that only a single chunk is memory-resident at any one time:
$inputFileType = 'Excel5';
$inputFileName = './sampleData/example2.xls';
/** Define a Read Filter class implementing PHPExcel_Reader_IReadFilter */
class chunkReadFilter implements PHPExcel_Reader_IReadFilter {
    private $_startRow = 0;
    private $_endRow = 0;
    /** Set the list of rows that we want to read */
    public function setRows($startRow, $chunkSize) {
        $this->_startRow = $startRow;
        $this->_endRow = $startRow + $chunkSize;
    }
    public function readCell($column, $row, $worksheetName = '') {
        // Only read the heading row, and the rows that are configured in $this->_startRow and $this->_endRow
        if (($row == 1) || ($row >= $this->_startRow && $row < $this->_endRow)) {
            return true;
        }
        return false;
    }
}
/** Create a new Reader of the type defined in $inputFileType **/
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
/** Define how many rows we want to read for each "chunk" **/
$chunkSize = 20;
/** Create a new Instance of our Read Filter **/
$chunkFilter = new chunkReadFilter();
/** Tell the Reader that we want to use the Read Filter that we've Instantiated **/
$objReader->setReadFilter($chunkFilter);
/** Loop to read our worksheet in "chunk size" blocks **/
/** $startRow is set to 2 initially because we always read the headings in row #1 **/
for ($startRow = 2; $startRow <= 65536; $startRow += $chunkSize) {
    /** Tell the Read Filter, the limits on which rows we want to read this iteration **/
    $chunkFilter->setRows($startRow,$chunkSize);
    /** Load only the rows that match our filter from $inputFileName to a PHPExcel Object **/
    $objPHPExcel = $objReader->load($inputFileName);
    // Do some processing here
    // Free up some of the memory
    $objPHPExcel->disconnectWorksheets();
    unset($objPHPExcel);
}
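The loop above always runs to row 65536, the maximum for an .xls worksheet, so it reloads the file many times even when the sheet is short. One possible refinement, sketched here under assumptions, is to ask the Reader for the real row count first and only iterate over chunks that exist. chunkStartRows() is a hypothetical helper; listWorksheetInfo() is assumed to be available in your PHPExcel version (it was added in later 1.7.x releases).

```php
<?php
// chunkStartRows() is a hypothetical helper, not a PHPExcel function.
// It returns the first row number of each chunk between $firstDataRow and $lastRow.
function chunkStartRows($firstDataRow, $lastRow, $chunkSize)
{
    $starts = array();
    for ($row = $firstDataRow; $row <= $lastRow; $row += $chunkSize) {
        $starts[] = $row;
    }
    return $starts;
}

// Sketch of use with PHPExcel (assumes $objReader, $chunkFilter, $inputFileName
// and $chunkSize are set up as in the chunk example; not executed here):
//
// $worksheetInfo = $objReader->listWorksheetInfo($inputFileName);
// $lastRow = $worksheetInfo[0]['totalRows'];   // row count of the first worksheet
// foreach (chunkStartRows(2, $lastRow, $chunkSize) as $startRow) {
//     $chunkFilter->setRows($startRow, $chunkSize);
//     $objPHPExcel = $objReader->load($inputFileName);
//     // Do some processing here
//     $objPHPExcel->disconnectWorksheets();
//     unset($objPHPExcel);
// }
```

The start row of 2 keeps the convention that row 1 holds the headings and is always read by the filter.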
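If you don't need to load formatting information, but only the worksheet data, then the setReadDataOnly() method will tell the Reader to load only cell values, ignoring any cell formatting: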
$inputFileType = 'Excel5';
$inputFileName = './sampleData/example1.xls';
/** Create a new Reader of the type defined in $inputFileType **/
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
/** Advise the Reader that we only want to load cell data, not formatting **/
$objReader->setReadDataOnly(true);
/** Load $inputFileName to a PHPExcel Object **/
$objPHPExcel = $objReader->load($inputFileName);
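Use cell caching. This is a method for reducing the PHP memory that is required for each cell, but at a cost in speed. It works by storing the cell objects in a compressed format, or outside of PHP's memory (e.g. disk, APC, memcache)... but the more memory you save, the slower your scripts will execute. You can, however, reduce the memory required by each cell to about 300 bytes, so the hypothetical 5M cells would require about 1.4GB of PHP memory.
Cell caching is described in section 4.2.1 of the Developer Documentation.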
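As a minimal sketch of what enabling cell caching can look like, the wrapper below selects the php://temp cache method. The function name enableCellCaching() and the '8MB' threshold are illustrative assumptions; the code also assumes PHPExcel is already loaded, and that it is called before any workbook is created or loaded, since the cache method has to be set first.

```php
<?php
// enableCellCaching() is a hypothetical wrapper around PHPExcel's cache settings.
// Call it BEFORE instantiating PHPExcel or calling $objReader->load().
function enableCellCaching($memoryCacheSize = '8MB')
{
    // Keep cells in php://temp: in memory up to $memoryCacheSize, then on disk
    $cacheMethod = PHPExcel_CachedObjectStorageFactory::cache_to_phpTemp;
    $cacheSettings = array('memoryCacheSize' => $memoryCacheSize);

    // Returns true if the chosen cache method is available and was set
    return PHPExcel_Settings::setCacheStorageMethod($cacheMethod, $cacheSettings);
}

// Usage sketch (not executed here):
// if (!enableCellCaching('8MB')) {
//     die('Requested cache method is not available');
// }
// $objPHPExcel = $objReader->load($inputFileName);
```

Other cache methods trade memory against speed differently, so it is worth timing a representative file before settling on one.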
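EDIT
Looking at your code, you're using the iterators, which aren't particularly efficient, and building up an array of cell data. You might want to look at the toArray() method, which is already built into PHPExcel, and does this for you. Also take a look at the recent discussion on SO about the newer variant method rangeToArray(), which can be used to build an associative array of row data.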
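As a rough illustration of the associative-array idea, the helper below takes rows in the shape that toArray() or rangeToArray() return and keys each data row by the heading row. rowsToAssoc() is a hypothetical helper, not a PHPExcel method, and it assumes the first row holds unique, non-empty headings.

```php
<?php
// rowsToAssoc() is a hypothetical helper, not part of PHPExcel.
// $rows is an array of rows, e.g. the result of
//   $objPHPExcel->getActiveSheet()->toArray(null, true, true, true)
// or
//   $objPHPExcel->getActiveSheet()->rangeToArray('A1:E7', null, true, true, true)
function rowsToAssoc(array $rows)
{
    // Take the first row as the headings; the remaining rows are data
    $headings = array_shift($rows);
    $result = array();
    foreach ($rows as $row) {
        // Pair each heading with the value in the same column position
        $result[] = array_combine($headings, $row);
    }
    return $result;
}

// Stand-in data shaped like a toArray() result with cell references
$rows = array(
    1 => array('A' => 'id', 'B' => 'name'),
    2 => array('A' => 1,    'B' => 'Widget'),
    3 => array('A' => 2,    'B' => 'Gadget'),
);
print_r(rowsToAssoc($rows));
```

Note that toArray() still builds the whole array in memory, so on large sheets it is best combined with a read filter or a limited range.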