Skip to content

[api-minor] Improve performance with image masks (bug 857031) #14754

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 9, 2022

Conversation

calixteman
Copy link
Contributor

  • it aims to partially fix performance issue reported: https://bugzilla.mozilla.org/show_bug.cgi?id=857031;
  • the idea is too avoid to use byte arrays but use ImageBitmap which are a way faster to draw:
    • an ImageBitmap is Transferable which means that it can be built in the worker instead of in the main thread:
      • this is achieved in using an OffscreenCanvas when it's available, there is a bug to enable them
        for pdf.js: https://bugzilla.mozilla.org/show_bug.cgi?id=1763330;
      • or in using createImageBitmap: in Firefox a task is sent to the main thread to build the bitmap so
        it's slightly slower than using an OffscreenCanvas.
    • it's transfered from the worker to the main thread by "reference";
    • the byte buffers used to create the image data have a very short lifetime and ergo the memory used is globally
      less than before.
  • Use the localImageCache for the mask;
  • Fix the pdf issue4436r.pdf: it was expected to have a binary stream for the image;
  • Move the singlePixel trick from operator_list to image: this way we can use this trick even if it isn't in a set
    as defined in operator_list.

@calixteman
Copy link
Contributor Author

For information, I tested with the pdf in https://bugzilla.mozilla.org/show_bug.cgi?id=878397.
I measured the memory used with the devtools in local Firefox build:

  • current: 24.44Mb when the pdf is just loaded and 18.50Mb after a GC
  • with the patch and OffscreenCanvas enabled: 10.07Mb after pdf loaded an 7.49Mb after a GC.

Copy link
Collaborator

@Snuffleupagus Snuffleupagus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I think this looks like a really nice way to improve things here, however there's unfortunately two problematic areas (which is why I holding off on doing a "proper" review).

For both Safari and Node.js, none of the functionality that this patch depends on is actually supported :-(

Hence it, most unfortunately, looks to me like we'll actually need to keep the old code around as a fallback and do something similar to this existing code in the putBinaryImageMask function as well.

@calixteman
Copy link
Contributor Author

Arf...
A way to workaround this without the need to maintain too much stuff could be to generate a PNG buffer in using (this is based on a @brendandahl's idea):

const convertImgDataToPng = (function () {

then convert into a blob and pass it to sendImgData where we can easily create an ImageBitmap.

Before using createImageBitmap I was sending to the main thread a buffer (consumed "immediately" in

messageHandler.on("obj", ([id, pageIndex, type, imageData]) => {
in using createImageBitmap) in case of no OffscreenCanvas and with the pdf from bug 878397 I noticed that before GC we have a huge memory consumption (more than 100Mb).
Maybe I won't have the same issue with a Blob else, as you mentioned, there still is this possibility to keep the current way to do.

@calixteman
Copy link
Contributor Author

I tested the PNG thing and it's a bit slower.
So I just added back the old way to do and slightly refactor to avoid code duplication (see shared/image_utils.js).

@calixteman calixteman requested a review from Snuffleupagus April 7, 2022 15:23
@marco-c
Copy link
Contributor

marco-c commented Apr 7, 2022

Is this also improving perf with the PDF from https://bugzilla.mozilla.org/show_bug.cgi?id=1135277?

Copy link
Collaborator

@Snuffleupagus Snuffleupagus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks very good in general, and keeping the old code as a fallback is a nice/simple solution for Node.js and older browsers.
However, there's a few things that I'd like to see changed, please see the inline comments.

Given that OffscreenCanvas is not enabled by default in Firefox, we're thus not testing that configuration as-is. That seems unfortunate, but I believe that we should be able to work-around this by adding gfx.offscreencanvas.enabled: true, to the list in

pdf.js/test/test.js

Lines 943 to 963 in 497c061

if (browserName === "firefox") {
options.extraPrefsFirefox = {
// avoid to have a prompt when leaving a page with a form
"dom.disable_beforeunload": true,
// Disable dialog when saving a pdf
"pdfjs.disabled": true,
"browser.helperApps.neverAsk.saveToDisk": "application/pdf",
// Avoid popup when saving is done
"browser.download.improvements_to_download_panel": false,
"browser.download.panel.shown": true,
// Save file in output
"browser.download.folderList": 2,
"browser.download.dir": tempDir,
// Print silently in a pdf
"print.always_print_silent": true,
"print.show_print_progress": false,
print_printer: "PDF",
"print.printer_PDF.print_to_file": true,
"print.printer_PDF.print_to_filename": printFile,
};
}

Copy link
Collaborator

@Snuffleupagus Snuffleupagus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r=me, with a couple of last comments and all tests passing; thank you!

Comment on lines 855 to 856
// Slow path: OffscreenCanvas or createImageBitmap are not
// available in the worker.
Copy link
Collaborator

@Snuffleupagus Snuffleupagus Apr 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove the mention of createImageBitmap here, since that's no longer being used.

}
return;
}

imgData.cached = !!cacheKey;
Copy link
Collaborator

@Snuffleupagus Snuffleupagus Apr 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this still matter here, since the actual imgData will be sent in a separate message (via this._sendImgData) and below args.data will always contain a string now?

'OperatorList._transfers: Unsupported "arg.data" type.'
);
}
if (!arg.cached) {
if (!arg.cached && arg.data && arg.data.buffer) {
Copy link
Collaborator

@Snuffleupagus Snuffleupagus Apr 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe if (!arg.cached && arg.data && arg.data.buffer instanceof ArrayBuffer) { to prevent any possible future issues (since arg.data will be a string in some cases)?


ctx.putImageData(imgData, 0, 0);
const bitmap = canvas.transferToImageBitmap();

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point, I'm guessing that the OffscreenCanvas is no longer needed for anything once the ImageBitmap has been created?
If so, would it make sense to try and "dispose" of it immediately similar to what we do in other parts of the code-base; see e.g.

pdf.js/web/pdf_page_view.js

Lines 297 to 300 in 2b673a6

// Zeroing the width and height causes Firefox to release graphics
// resources immediately, which can greatly reduce memory consumption.
this.canvas.width = 0;
this.canvas.height = 0;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure it's useful, because we transfered the ownership of the underlying surface (in the ImageBitmap) associated with this canvas and the object itself should be destroyed at the end of the scope.
In following what happens when e.g. SetWidth is called:
https://searchfox.org/mozilla-central/rev/c0a97899404931861a4bbec21f55d3cd79b90bb8/dom/canvas/OffscreenCanvas.cpp#76
I don't see anything special which could indicate that's useful.
Anyway, I trust your judgement so if you think it's better, I'll add these lines.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really have no idea if this matters for OffscreenCanvas, I was simply asking based on a pattern we use elsewhere for "regular" canvases (originally added all the way back in PR #4920).

So, do whatever you see fit here :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jrmuizel, do you know if setting width & height to 0 is useful for an OffscreenCanvas with a lifetime limited to a function scope ?

- it aims to partially fix performance issue reported: https://bugzilla.mozilla.org/show_bug.cgi?id=857031;
- the idea is too avoid to use byte arrays but use ImageBitmap which are a way faster to draw:
  * an ImageBitmap is Transferable which means that it can be built in the worker instead of in the main thread:
    - this is achieved in using an OffscreenCanvas when it's available, there is a bug to enable them
      for pdf.js: https://bugzilla.mozilla.org/show_bug.cgi?id=1763330;
    - or in using createImageBitmap: in Firefox a task is sent to the main thread to build the bitmap so
      it's slightly slower than using an OffscreenCanvas.
  * it's transfered from the worker to the main thread by "reference";
  * the byte buffers used to create the image data have a very short lifetime and ergo the memory used is globally
    less than before.
- Use the localImageCache for the mask;
- Fix the pdf issue4436r.pdf: it was expected to have a binary stream for the image;
- Move the singlePixel trick from operator_list to image: this way we can use this trick even if it isn't in a set
  as defined in operator_list.
@calixteman
Copy link
Contributor Author

/botio test

@pdfjsbot
Copy link

pdfjsbot commented Apr 9, 2022

From: Bot.io (Windows)


Received

Command cmd_test from @calixteman received. Current queue size: 0

Live output at: http://54.193.163.58:8877/275ba94df05d145/output.txt

@pdfjsbot
Copy link

pdfjsbot commented Apr 9, 2022

From: Bot.io (Linux m4)


Received

Command cmd_test from @calixteman received. Current queue size: 0

Live output at: http://54.241.84.105:8877/0b5bb9fbe5cbd15/output.txt

@pdfjsbot
Copy link

pdfjsbot commented Apr 9, 2022

From: Bot.io (Linux m4)


Failed

Full output at http://54.241.84.105:8877/0b5bb9fbe5cbd15/output.txt

Total script time: 24.95 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Integration Tests: FAILED
  • Regression tests: FAILED
  different ref/snapshot: 13

Image differences available at: http://54.241.84.105:8877/0b5bb9fbe5cbd15/reftest-analyzer.html#web=eq.log

@pdfjsbot
Copy link

pdfjsbot commented Apr 9, 2022

From: Bot.io (Windows)


Failed

Full output at http://54.193.163.58:8877/275ba94df05d145/output.txt

Total script time: 26.75 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Integration Tests: FAILED
  • Regression tests: FAILED
  different ref/snapshot: 3

Image differences available at: http://54.193.163.58:8877/275ba94df05d145/reftest-analyzer.html#web=eq.log

@calixteman
Copy link
Contributor Author

calixteman commented Apr 9, 2022

The failure in integration tests is unrelated to this patch.
And the few "regressions" are from my pov an improvement and are related to the change about 1x1 opaque masks.
@Snuffleupagus does it work for you ?

@Snuffleupagus
Copy link
Collaborator

Snuffleupagus commented Apr 9, 2022

And the few "regressions" are from my pov an improvement and are related to the change about 1x1 opaque masks.
@Snuffleupagus does it work for you ?

Generally yes, but why was modifying the issue4436r.pdf test-file necessary here? And, importantly, does the "new" file still trigger the same exact code-paths as before?

Note that the reduced test-case was originally created based on the information in #4436 (comment).

@calixteman
Copy link
Contributor Author

calixteman commented Apr 9, 2022

The stream for the image contains the 4 chars 0x00 and since the image is 1x1 then only the char "0" is taken which correspond to 48.
The 1x1 stuff I moved was executed only if the element was 0 for a 1x1 image so it wasn't executed.
Now only the first bit is considered, so the trick is correctly executed.
So I guessed that 0x00 was for a null byte, hence I replaced the 4 bytes by NULL (to avoid to have to rewrite the ref table).
So as far as I can tell, the previous test case was not triggering the code path it was supposed to test.
It'll be ok to let it like it is, I mean with the 0x00, because the first bit of 48 is 0 and we'll have the same "regression", but I thought that it was better to have a test with the correct value.

@Snuffleupagus
Copy link
Collaborator

The stream for the image contains the 4 chars 0x00 and since the image is 1x1 then only the char "0" is taken which correspond to 48. The 1x1 stuff I moved was executed only if the element was 0 for a 1x1 image so it wasn't executed. Now only the first bit is considered, so the trick is correctly executed. So I guessed that 0x00 was for a null byte, hence I replaced the 4 bytes by NULL (to avoid to have to rewrite the ref table). So as far as I can tell, the previous test case was not triggering the code path it was supposed to test. It'll be ok to let it like it is, I mean with the 0x00, because the first bit is 0 of 48 and we'll have the same "regression", but I thought that it was better to have a test with the correct value.

Thanks for the thorough explanation, this seems like a good improvement here!

(I suppose this just shows that trying to create a reduced test-case without access to the original document is rarely a good idea...)

@calixteman calixteman merged commit 08e1abe into mozilla:master Apr 9, 2022
@calixteman
Copy link
Contributor Author

/botio makeref

@pdfjsbot
Copy link

pdfjsbot commented Apr 9, 2022

From: Bot.io (Linux m4)


Received

Command cmd_makeref from @calixteman received. Current queue size: 0

Live output at: http://54.241.84.105:8877/0d903dd70b30e08/output.txt

@pdfjsbot
Copy link

pdfjsbot commented Apr 9, 2022

From: Bot.io (Windows)


Received

Command cmd_makeref from @calixteman received. Current queue size: 1

Live output at: http://54.193.163.58:8877/2c0a8b92f9a84fe/output.txt

@pdfjsbot
Copy link

pdfjsbot commented Apr 9, 2022

From: Bot.io (Linux m4)


Success

Full output at http://54.241.84.105:8877/0d903dd70b30e08/output.txt

Total script time: 22.00 mins

  • Lint: Passed
  • Make references: Passed
  • Check references: Passed

@pdfjsbot
Copy link

pdfjsbot commented Apr 9, 2022

From: Bot.io (Windows)


Success

Full output at http://54.193.163.58:8877/2c0a8b92f9a84fe/output.txt

Total script time: 21.49 mins

  • Lint: Passed
  • Make references: Passed
  • Check references: Passed

@Snuffleupagus Snuffleupagus changed the title Improve performance with image masks (bug 857031) [api-minor] Improve performance with image masks (bug 857031) Apr 15, 2022
calixteman added a commit to calixteman/pdf.js that referenced this pull request Feb 15, 2023
We introduced the use of OffscreenCanvas in mozilla#14754 and this patch aims
to use them for all kind of images.
It'll slightly improve performances (and maybe slightly decrease memory use).
Since an image can be rendered in using some transfer maps but because of
OffscreenCanvas we don't have the underlying pixels array the transfer maps
stuff is re-implemented in using the SVG filter feComponentTransfer.
calixteman added a commit to calixteman/pdf.js that referenced this pull request Feb 15, 2023
We introduced the use of OffscreenCanvas in mozilla#14754 and this patch aims
to use them for all kind of images.
It'll slightly improve performances (and maybe slightly decrease memory use).
Since an image can be rendered in using some transfer maps but because of
OffscreenCanvas we don't have the underlying pixels array the transfer maps
stuff is re-implemented in using the SVG filter feComponentTransfer.
calixteman added a commit to calixteman/pdf.js that referenced this pull request Feb 15, 2023
We introduced the use of OffscreenCanvas in mozilla#14754 and this patch aims
to use them for all kind of images.
It'll slightly improve performances (and maybe slightly decrease memory use).
Since an image can be rendered in using some transfer maps but because of
OffscreenCanvas we don't have the underlying pixels array the transfer maps
stuff is re-implemented in using the SVG filter feComponentTransfer.
calixteman added a commit to calixteman/pdf.js that referenced this pull request Feb 17, 2023
We introduced the use of OffscreenCanvas in mozilla#14754 and this patch aims
to use them for all kind of images.
It'll slightly improve performances (and maybe slightly decrease memory use).
Since an image can be rendered in using some transfer maps but because of
OffscreenCanvas we don't have the underlying pixels array the transfer maps
stuff is re-implemented in using the SVG filter feComponentTransfer.
calixteman added a commit to calixteman/pdf.js that referenced this pull request Feb 20, 2023
We introduced the use of OffscreenCanvas in mozilla#14754 and this patch aims
to use them for all kind of images.
It'll slightly improve performances (and maybe slightly decrease memory use).
Since an image can be rendered in using some transfer maps but because of
OffscreenCanvas we don't have the underlying pixels array the transfer maps
stuff is re-implemented in using the SVG filter feComponentTransfer.
calixteman added a commit to calixteman/pdf.js that referenced this pull request Feb 22, 2023
We introduced the use of OffscreenCanvas in mozilla#14754 and this patch aims
to use them for all kind of images.
It'll slightly improve performances (and maybe slightly decrease memory use).
Since an image can be rendered in using some transfer maps but because of
OffscreenCanvas we don't have the underlying pixels array the transfer maps
stuff is re-implemented in using the SVG filter feComponentTransfer.
calixteman added a commit to calixteman/pdf.js that referenced this pull request Feb 28, 2023
We introduced the use of OffscreenCanvas in mozilla#14754 and this patch aims
to use them for all kind of images.
It'll slightly improve performances (and maybe slightly decrease memory use).
Since an image can be rendered in using some transfer maps but because of
OffscreenCanvas we don't have the underlying pixels array the transfer maps
stuff is re-implemented in using the SVG filter feComponentTransfer.
calixteman added a commit to calixteman/pdf.js that referenced this pull request Feb 28, 2023
We introduced the use of OffscreenCanvas in mozilla#14754 and this patch aims
to use them for all kind of images.
It'll slightly improve performances (and maybe slightly decrease memory use).
Since an image can be rendered in using some transfer maps but because of
OffscreenCanvas we don't have the underlying pixels array the transfer maps
stuff is re-implemented in using the SVG filter feComponentTransfer.
calixteman added a commit to calixteman/pdf.js that referenced this pull request Mar 1, 2023
We introduced the use of OffscreenCanvas in mozilla#14754 and this patch aims
to use them for all kind of images.
It'll slightly improve performances (and maybe slightly decrease memory use).
Since an image can be rendered in using some transfer maps but because of
OffscreenCanvas we don't have the underlying pixels array the transfer maps
stuff is re-implemented in using the SVG filter feComponentTransfer.
calixteman added a commit to calixteman/pdf.js that referenced this pull request Mar 1, 2023
We introduced the use of OffscreenCanvas in mozilla#14754 and this patch aims
to use them for all kind of images.
It'll slightly improve performances (and maybe slightly decrease memory use).
Since an image can be rendered in using some transfer maps but because of
OffscreenCanvas we don't have the underlying pixels array the transfer maps
stuff is re-implemented in using the SVG filter feComponentTransfer.
calixteman added a commit to calixteman/pdf.js that referenced this pull request Mar 1, 2023
We introduced the use of OffscreenCanvas in mozilla#14754 and this patch aims
to use them for all kind of images.
It'll slightly improve performances (and maybe slightly decrease memory use).
Since an image can be rendered in using some transfer maps but because of
OffscreenCanvas we don't have the underlying pixels array the transfer maps
stuff is re-implemented in using the SVG filter feComponentTransfer.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants