Enforce PAGE-scrolling for very large/long documents (bug 1588435, PR 11263 follow-up) #14324

Snuffleupagus · 2021-11-29T13:22:12Z

This patch is essentially a continuation of PR #11263, which tried to improve loading/initialization performance of very large/long documents.

Note that browsers, in general, don't handle a huge amount of DOM-elements very well, with really poor (e.g. sluggish scrolling) performance once the number gets "large". Furthermore, at least in Firefox, it seems that DOM-elements towards the bottom of a HTML-page can effectively be ignored; for the PDF.js viewer that means that pages at the end of the document can become impossible to access.

Hence, in order to improve things for these very large/long documents, this patch will now enforce usage of the (recently added) PAGE-scrolling mode for these documents. As implemented, this will only happen once the number of pages exceed 15000 (which is hopefully rare in practice).
While this might feel a bit jarring to users being forced to use PAGE-scrolling, it seems all things considered like a better idea to ensure that the entire document actually remains accessible and with (hopefully) more acceptable performance.

Fixes bug 1588435, to the extent that doing so is possible since the document contains 25560 pages (and is 197 MB large).

…olbar.js` (PR 9877 follow-up) This was added on the assumption that the viewer would (eventually) start using the `PDFSinglePageViewer` for e.g. PAGE-scrolling mode and PresentationMode. However, having both a `PDFViewer` and a `PDFSinglePageViewer` side-by-side in the viewer would've been tricky to implement well, which is why PR 14112 implemented PAGE-scrolling for the general `BaseViewer` instead. Given that the default viewer is no longer (potentially) going to use `PDFSinglePageViewer`, there's code in the `SecondaryToolbar` (and related CSS rules) which is now unnecessary.

…PR 11263 follow-up) This patch is essentially a continuation of PR 11263, which tried to improve loading/initialization performance of *very* large/long documents. Note that browsers, in general, don't handle a huge amount of DOM-elements very well, with really poor (e.g. sluggish scrolling) performance once the number gets "large". Furthermore, at least in Firefox, it seems that DOM-elements towards the bottom of a HTML-page can effectively be ignored; for the PDF.js viewer that means that pages at the end of the document can become impossible to access. Hence, in order to improve things for these *very* large/long documents, this patch will now enforce usage of the (recently added) PAGE-scrolling mode for these documents. As implemented, this will only happen once the number of pages *exceed* 15000 (which is hopefully rare in practice). While this might feel a bit jarring to users being *forced* to use PAGE-scrolling, it seems all things considered like a better idea to ensure that the entire document actually remains accessible and with (hopefully) more acceptable performance. Fixes [bug 1588435](https://bugzilla.mozilla.org/show_bug.cgi?id=1588435), to the extent that doing so is possible since the document contains 25560 pages (and is 197 MB large).

Snuffleupagus · 2021-11-29T13:25:44Z

/botio-linux preview

pdfjsbot · 2021-11-29T13:25:45Z

From: Bot.io (Linux m4)

Received

Command cmd_preview from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.241.84.105:8877/96544d351ca1920/output.txt

pdfjsbot · 2021-11-29T13:30:15Z

From: Bot.io (Linux m4)

Success

Full output at http://54.241.84.105:8877/96544d351ca1920/output.txt

Total script time: 4.48 mins

Published

Viewer: http://54.241.84.105:8877/96544d351ca1920/web/viewer.html
Viewer (legacy): http://54.241.84.105:8877/96544d351ca1920/legacy/web/viewer.html

brendandahl · 2021-11-29T19:51:01Z

Thanks for looking into this. Another option I was considering is using what many of the infinite scroll libraries do, where you have a fixed number of page divs and as they go out of view when you scroll you recycle them and add them after the visible elements. The scroll bar can become jumpy with this approach, but I think it's also possible to create really big container that is the estimated size of all the pages.

brendandahl · 2021-11-29T19:51:54Z

Of course the above would be a much bigger undertaking.

Snuffleupagus · 2021-11-29T19:55:41Z

Of course the above would be a much bigger undertaking.

It would, and there'd likely be a bunch of edge-cases that would make all of that much more difficult than it seems at first.

Considering that these very large documents hopefully should not be all that common, I just figured that simply utilizing the now existing PAGE-scrolling mode seemed like a quick and easy win (especially compared to the current state of things).

timvandermeij · 2021-12-02T19:02:30Z

For such large documents, this seems like a very reasonable approach to me; in any case it's better than the document being inaccessible. If necessary there can always be a follow-up to implement a smarter method, but this should already improve the current situation. Thank you for doing this!

Snuffleupagus added 2 commits November 29, 2021 13:13

Snuffleupagus added viewer performance labels Nov 29, 2021

timvandermeij approved these changes Dec 2, 2021

View reviewed changes

timvandermeij merged commit e9e4b91 into mozilla:master Dec 2, 2021

Snuffleupagus deleted the force-PAGE-scrolling branch December 2, 2021 19:40

Snuffleupagus mentioned this pull request Jul 2, 2024

[bug]: newer versions of PDF.js (the underlying PDF viewer used by Zotero and Firefox) force page-by-page scrolling mode #18364

Closed

This comment was marked as duplicate.

Sign in to view

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Enforce PAGE-scrolling for very large/long documents (bug 1588435, PR 11263 follow-up) #14324

Enforce PAGE-scrolling for very large/long documents (bug 1588435, PR 11263 follow-up) #14324

Snuffleupagus commented Nov 29, 2021

Snuffleupagus commented Nov 29, 2021

pdfjsbot commented Nov 29, 2021

pdfjsbot commented Nov 29, 2021

brendandahl commented Nov 29, 2021

brendandahl commented Nov 29, 2021

Snuffleupagus commented Nov 29, 2021 •

edited

Loading

timvandermeij commented Dec 2, 2021

This comment was marked as duplicate.

Enforce PAGE-scrolling for *very* large/long documents (bug 1588435, PR 11263 follow-up) #14324

Enforce PAGE-scrolling for *very* large/long documents (bug 1588435, PR 11263 follow-up) #14324

Conversation

Snuffleupagus commented Nov 29, 2021

Snuffleupagus commented Nov 29, 2021

pdfjsbot commented Nov 29, 2021

From: Bot.io (Linux m4)

Received

pdfjsbot commented Nov 29, 2021

From: Bot.io (Linux m4)

Success

Published

brendandahl commented Nov 29, 2021

brendandahl commented Nov 29, 2021

Snuffleupagus commented Nov 29, 2021 • edited Loading

timvandermeij commented Dec 2, 2021

This comment was marked as duplicate.

Enforce PAGE-scrolling for very large/long documents (bug 1588435, PR 11263 follow-up) #14324

Enforce PAGE-scrolling for very large/long documents (bug 1588435, PR 11263 follow-up) #14324

Snuffleupagus commented Nov 29, 2021 •

edited

Loading