Commit b211762

Allow Xlsx Reader to Specify ParseHuge Release210 (#4516)
* Allow Xlsx Reader to Specify ParseHuge Release210

  Fix #4260. A number of Security Advisories related to libxml_options were opened. In the end, we disabled the ability to specify any libxml_options. However, some users were adversely affected because they needed LIBXML_PARSEHUGE for some of their files. Having finally obtained access to a file demonstrating this problem, we can restore this ability.

  - The operation is potentially dangerous, a vector for memory leaks and out-of-memory errors. It is not recommended unless absolutely needed.
  - It will not be permitted as a global (static) property with the ability to adversely affect other users on the same server.
  - It will instead be implemented as an instance property of Xlsx Reader (default to false), with a setter. I do not see a use case for a getter.
  - People will need to set this property individually for each file which they think needs it.
  - This change will be backported to all supported releases.
  - The sheer size and processing time for the file involved makes it impractical to add a formal test case. It has, nevertheless, been tested satisfactorily.

* Update CHANGELOG.md
1 parent 2219ded commit b211762
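
As a rough illustration of the per-file opt-in described in the commit message, a caller might enable the option only for the one workbook that needs it. This is a sketch, not part of the commit: `loadWorkbook` and its `$needsParseHuge` parameter are hypothetical names, and it assumes a PhpSpreadsheet release containing this change (2.1.10 or a corresponding backport) is installed and autoloaded via Composer.

```php
<?php

use PhpOffice\PhpSpreadsheet\Reader\Xlsx;
use PhpOffice\PhpSpreadsheet\Spreadsheet;

/**
 * Hypothetical helper: load one workbook, opting in to LIBXML_PARSEHUGE
 * only when the caller says this particular file needs it.
 */
function loadWorkbook(string $path, bool $needsParseHuge = false): Spreadsheet
{
    // A fresh reader per file: the flag is an instance property,
    // so it never leaks into other readers or other requests.
    $reader = new Xlsx();

    // setParseHuge() is the setter added by this commit; it defaults to false.
    $reader->setParseHuge($needsParseHuge);

    return $reader->load($path);
}
```

Keeping the flag off by default and passing `true` only for a file known to fail without it matches the commit's recommendation that the option be used only when absolutely needed.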

2 files changed: +25 -7 lines changed

CHANGELOG.md

Lines changed: 2 additions & 1 deletion
@@ -5,7 +5,7 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com)
 and this project adheres to [Semantic Versioning](https://semver.org).
 
-# TBD - 2.1.10
+# 2025-06-22 - 2.1.10
 
 ### Changed

@@ -19,6 +19,7 @@ and this project adheres to [Semantic Versioning](https://semver.org).
 
 - TEXT and TIMEVALUE functions. [Issue #4249](https://github.com/PHPOffice/PhpSpreadsheet/issues/4249) [PR #4353](https://github.com/PHPOffice/PhpSpreadsheet/pull/4353)
 - Removing Columns/Rows Containing Merged Cells. Backport of [PR #4465](https://github.com/PHPOffice/PhpSpreadsheet/pull/4465)
+- Allow Xlsx Reader to Specify ParseHuge. [Issue #4260](https://github.com/PHPOffice/PhpSpreadsheet/issues/4260) [PR #4516](https://github.com/PHPOffice/PhpSpreadsheet/pull/4516)
 
 # 2025-02-07 - 2.1.9

src/PhpSpreadsheet/Reader/Xlsx.php

Lines changed: 23 additions & 6 deletions
@@ -57,6 +57,19 @@ class Xlsx extends BaseReader
 
     private array $sharedFormulae = [];
 
+    private bool $parseHuge = false;
+
+    /**
+     * Allow use of LIBXML_PARSEHUGE.
+     * This option can lead to memory leaks and failures,
+     * and is not recommended. But some very large spreadsheets
+     * seem to require it.
+     */
+    public function setParseHuge(bool $parseHuge): void
+    {
+        $this->parseHuge = $parseHuge;
+    }
+
     /**
      * Create a new Xlsx Reader instance.
      */
@@ -118,8 +131,8 @@ private function loadZip(string $filename, string $ns = '', bool $replaceUnclose
         }
         $rels = @simplexml_load_string(
             $this->getSecurityScannerOrThrow()->scan($contents),
-            'SimpleXMLElement',
-            0,
+            SimpleXMLElement::class,
+            $this->parseHuge ? LIBXML_PARSEHUGE : 0,
             $ns
         );

@@ -133,8 +146,8 @@ private function loadZipNonamespace(string $filename, string $ns): SimpleXMLElem
         $contents = $this->getFromZipArchive($this->zip, $filename);
         $rels = simplexml_load_string(
             $this->getSecurityScannerOrThrow()->scan($contents),
-            'SimpleXMLElement',
-            0,
+            SimpleXMLElement::class,
+            $this->parseHuge ? LIBXML_PARSEHUGE : 0,
             ($ns === '' ? $ns : '')
         );

@@ -248,7 +261,9 @@ public function listWorksheetInfo(string $filename): array
                                 $this->zip,
                                 $fileWorksheetPath
                             )
-                        )
+                        ),
+                    null,
+                    $this->parseHuge ? LIBXML_PARSEHUGE : 0,
                 );
                 $xml->setParserProperty(2, true);

@@ -1961,7 +1976,9 @@ private function readRibbon(Spreadsheet $excel, string $customUITarget, ZipArchi
                 // exists and not empty if the ribbon have some pictures (other than internal MSO)
                 $UIRels = simplexml_load_string(
                     $this->getSecurityScannerOrThrow()
-                        ->scan($dataRels)
+                        ->scan($dataRels),
+                    SimpleXMLElement::class,
+                    $this->parseHuge ? LIBXML_PARSEHUGE : 0
                 );
                 if (false !== $UIRels) {
                     // we need to save id and target to avoid parsing customUI.xml and "guess" if it's a pseudo callback who load the image
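
The pattern repeated in the hunks above, passing `LIBXML_PARSEHUGE` as the libxml options argument only when an explicit flag is set and `0` otherwise, can be tried in isolation with plain `simplexml_load_string`. The following is a minimal sketch under stated assumptions: `parseXmlString` is a hypothetical helper, not a function in PhpSpreadsheet, it omits the security scanner the reader runs on its input, and the union return type assumes PHP 8.0 or later.

```php
<?php

/**
 * Hypothetical helper mirroring the ternary used in the diff:
 * the huge-document limit is lifted only when explicitly requested.
 */
function parseXmlString(string $xml, bool $parseHuge = false): SimpleXMLElement|false
{
    return simplexml_load_string(
        $xml,
        SimpleXMLElement::class,                 // class-name constant instead of a string literal
        $parseHuge ? LIBXML_PARSEHUGE : 0        // 0 keeps libxml's default parser limits
    );
}

// Small documents parse the same way with or without the flag.
$default = parseXmlString('<rels><rel id="rId1"/></rels>');
$huge = parseXmlString('<rels><rel id="rId1"/></rels>', true);
```

The flag only matters for documents that exceed libxml's built-in limits, which is why the commit leaves it off unless a caller opts in.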
