Ignore globally cached images in PartialEvaluator.getTextContent (PR 11930 follow-up) #12922

Merged

Snuffleupagus
Collaborator

Given that only /XObjects of the Image type are cached globally, we can use that fact in PartialEvaluator.getTextContent as well. This way, in cases such as issue #12098, we can avoid fetching and parsing /XObjects that we already know to be Images. This is helpful because Streams are not cached on the XRef instance (given their potential size), so the lookup can be somewhat expensive in general.
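The idea can be illustrated with a minimal, self-contained sketch. Note that `SketchXRef`, `extractXObjectText`, `knownImageRefs`, and the string refs are hypothetical stand-ins invented for this illustration; this is not the actual pdf.js code, only a model of skipping the fetch when a reference is already known to point at an Image.

```javascript
// Hypothetical stand-in for an XRef that does NOT cache fetched streams,
// so every fetch() call represents real (potentially expensive) work.
class SketchXRef {
  constructor(objects) {
    this.objects = objects; // Map: ref string -> object
    this.fetchCount = 0;
  }

  fetch(ref) {
    this.fetchCount++;
    return this.objects.get(ref);
  }
}

// Collect text from the referenced XObjects. References already known to be
// Images (assumed to come from a global image cache) cannot contain text,
// so they are skipped before any fetch happens.
function extractXObjectText(refs, xref, knownImageRefs) {
  const text = [];
  for (const ref of refs) {
    if (knownImageRefs.has(ref)) {
      continue; // Known Image: nothing to extract, avoid the fetch.
    }
    const xobj = xref.fetch(ref);
    if (xobj.subtype === "Form") {
      text.push(xobj.text);
    }
  }
  return text;
}

const sketchXRef = new SketchXRef(
  new Map([
    ["10R", { subtype: "Image" }],
    ["11R", { subtype: "Form", text: "Hello" }],
  ])
);
const knownImageRefs = new Set(["10R"]);
const sketchText = extractXObjectText(
  ["10R", "11R", "10R"],
  sketchXRef,
  knownImageRefs
);
console.log(sketchText, sketchXRef.fetchCount);
```

In this sketch the image reference appears twice but is never fetched; only the Form XObject reaches `fetch`.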

Also, skip a redundant RefSetCache.has check in the GlobalImageCache.getData method.
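As a rough illustration of why such a `has` check can be redundant, the sketch below compares a `has`-then-`get` lookup with a single `get` whose result is checked directly. `SketchImageCache` and its methods are hypothetical and use a plain `Map` as a stand-in; the sketch assumes stored values are never falsy, and it does not reproduce the real `GlobalImageCache` logic.

```javascript
// Hypothetical simplified cache; names are illustrative only.
class SketchImageCache {
  constructor() {
    this.dataByRef = new Map(); // ref string -> image data object
  }

  setData(ref, data) {
    this.dataByRef.set(ref, data);
  }

  // Two lookups: has() followed by get().
  getDataWithHas(ref) {
    if (!this.dataByRef.has(ref)) {
      return null;
    }
    return this.dataByRef.get(ref);
  }

  // One lookup: get() returns undefined for a missing key, so checking the
  // result covers the "not cached" case (assuming values are never falsy).
  getData(ref) {
    const data = this.dataByRef.get(ref);
    return data || null;
  }
}

const sketchCache = new SketchImageCache();
sketchCache.setData("10R", { width: 2, height: 2 });
console.log(sketchCache.getData("10R"), sketchCache.getData("99R"));
```

Both methods return the same results here; the second simply avoids the extra lookup.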

@Snuffleupagus Snuffleupagus force-pushed the getTextContent-globalImageCache branch from de4c448 to 72da2aa Compare January 28, 2021 09:19
@Snuffleupagus
Collaborator Author

/botio test

@pdfjsbot

From: Bot.io (Linux m4)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.67.70.0:8877/6a4db0c597608c0/output.txt

@pdfjsbot

From: Bot.io (Windows)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://3.101.106.178:8877/9dfb4167eb8b191/output.txt

@pdfjsbot

From: Bot.io (Linux m4)


Failed

Full output at http://54.67.70.0:8877/6a4db0c597608c0/output.txt

Total script time: 26.83 mins

  • Font tests: Passed
  • Unit tests: FAILED
  • Integration Tests: Passed
  • Regression tests: FAILED

Image differences available at: http://54.67.70.0:8877/6a4db0c597608c0/reftest-analyzer.html#web=eq.log

@pdfjsbot

From: Bot.io (Windows)


Failed

Full output at http://3.101.106.178:8877/9dfb4167eb8b191/output.txt

Total script time: 28.55 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Integration Tests: Passed
  • Regression tests: FAILED

Image differences available at: http://3.101.106.178:8877/9dfb4167eb8b191/reftest-analyzer.html#web=eq.log

@timvandermeij timvandermeij merged commit e4e92d1 into mozilla:master Jan 28, 2021
@timvandermeij
Contributor

Good idea; thanks!

@Snuffleupagus Snuffleupagus deleted the getTextContent-globalImageCache branch January 29, 2021 08:29
3 participants