Improve handling of named destinations in out-of-order NameTrees (PR 10274 follow-up) #13415
632f6ab to 0434bd0
From: Bot.io (Linux m4)
Received
Command cmd_unittest from @Snuffleupagus received. Current queue size: 0
Live output at: http://54.67.70.0:8877/e58289562fe5c9d/output.txt

From: Bot.io (Windows)
Received
Command cmd_unittest from @Snuffleupagus received. Current queue size: 0
Live output at: http://3.101.106.178:8877/be7c99392e34cc6/output.txt

From: Bot.io (Linux m4)
Failed
Full output at http://54.67.70.0:8877/e58289562fe5c9d/output.txt
Total script time: 2.18 mins
…10274 follow-up)

According to the specification, see https://web.archive.org/web/20210404042322if_/https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G6.2384179, the keys of a NameTree/NumberTree should be ordered. For corrupt PDF files, which violate this assumption, it's thus possible that trying to look up a single entry fails.

Previously, in PR 10274, we implemented a fallback that only applies to the "bottom" node of a NameTree/NumberTree, which in general might not actually help for sufficiently corrupt NameTree/NumberTree data.

Instead, we remove the current *limited* fallback from `NameOrNumberTree.get`, and defer to the call-site to handle this case explicitly, e.g. by using `NameOrNumberTree.getAll` for data where that makes sense. For well-formed documents, these changes should *not* lead to any additional data fetching/parsing.

Finally, as part of these changes, the validation of named destination data is improved in the `Catalog` and a new unit-test is also added.
0434bd0 to 8d56893
From: Bot.io (Windows)
Failed
Full output at http://3.101.106.178:8877/be7c99392e34cc6/output.txt
Total script time: 3.72 mins
90053c5 to 8d56893
From: Bot.io (Linux m4)
Received
Command cmd_unittest from @Snuffleupagus received. Current queue size: 0
Live output at: http://54.67.70.0:8877/28d3c5cc1f8f508/output.txt

From: Bot.io (Windows)
Received
Command cmd_unittest from @Snuffleupagus received. Current queue size: 0
Live output at: http://3.101.106.178:8877/30933ee4a554762/output.txt

From: Bot.io (Linux m4)
Success
Full output at http://54.67.70.0:8877/28d3c5cc1f8f508/output.txt
Total script time: 3.72 mins

From: Bot.io (Windows)
Success
Full output at http://3.101.106.178:8877/30933ee4a554762/output.txt
Total script time: 7.28 mins
Good idea, and thanks for the extra test!
According to the specification, see https://web.archive.org/web/20210404042322if_/https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G6.2384179, the keys of a NameTree/NumberTree should be ordered.
For corrupt PDF files, which violate this assumption, it's thus possible that trying to look up a single entry fails.
Previously, in PR #10274, we implemented a fallback that only applies to the "bottom" node of a NameTree/NumberTree, which in general might not actually help for sufficiently corrupt NameTree/NumberTree data.
Instead, we remove the current limited fallback from `NameOrNumberTree.get`, and defer to the call-site to handle this case explicitly, e.g. by using `NameOrNumberTree.getAll` for data where that makes sense. For well-formed documents, these changes should not lead to any additional data fetching/parsing.
Finally, as part of these changes, the validation of named destination data is improved in the `Catalog` and a new unit-test is also added.
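
To illustrate the idea, here is a minimal sketch of a call-site fallback. It is not the code from this patch: the plain-object tree shape and the functions `get`, `getAll`, and `getDestination` are hypothetical stand-ins, whereas the real `NameOrNumberTree` works on PDF dictionaries and references. The fast lookup trusts the key ordering, and the full traversal only runs when that lookup comes back empty.

```javascript
// Hypothetical, simplified name tree (NOT the real pdf.js data structures):
// an intermediate node has `kids`, each kid has `limits: [least, greatest]`,
// and a leaf has `names: [[key, value], ...]`.

// Fast path: follow `limits` downwards, then binary-search the leaf.
// Both steps assume that the keys are ordered, as the specification requires.
function get(node, key) {
  while (node.kids) {
    const kid = node.kids.find(
      k => k.limits[0] <= key && key <= k.limits[1]
    );
    if (!kid) {
      return null;
    }
    node = kid;
  }
  let lo = 0,
    hi = node.names.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const [k, value] = node.names[mid];
    if (key < k) {
      hi = mid - 1;
    } else if (key > k) {
      lo = mid + 1;
    } else {
      return value;
    }
  }
  return null;
}

// Slow path: visit every leaf, ignoring `limits` and ordering entirely.
// (No protection against cyclic trees in this sketch.)
function getAll(node, map = new Map()) {
  if (node.kids) {
    for (const kid of node.kids) {
      getAll(kid, map);
    }
  } else {
    for (const [k, value] of node.names) {
      map.set(k, value);
    }
  }
  return map;
}

// Call-site: only pay for the full traversal when the fast lookup fails,
// so well-formed trees never trigger the extra work.
function getDestination(tree, id) {
  const dest = get(tree, id);
  if (dest !== null) {
    return dest;
  }
  const all = getAll(tree);
  return all.has(id) ? all.get(id) : null;
}

// A corrupt tree: "zeta" lies outside its node's limits, and "beta" sits in
// a leaf whose limits do not cover it.
const corruptTree = {
  kids: [
    { limits: ["a", "m"], names: [["alpha", 1], ["zeta", 2]] },
    { limits: ["n", "z"], names: [["omega", 3], ["beta", 4]] },
  ],
};

console.log(get(corruptTree, "zeta")); // null
console.log(getDestination(corruptTree, "zeta")); // 2
```

In this sketch, a key that sits where the ordering says it should ("alpha") is found on the fast path, while misplaced keys are only recovered through the full traversal.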