
[api-minor] Support accessing both the original and modified PDF fingerprint #13661


Merged
merged 1 commit into from
Jul 3, 2021


Snuffleupagus
Collaborator

The PDF.js API has only ever supported accessing the original file ID; the second one, which (should) exist in modified documents, has thus far been completely inaccessible through the API.
That seems like a simple oversight, caused e.g. by the viewer not needing it, since it really shouldn't hurt to give API users the ability to check whether a PDF document has been modified since its creation.[1]

Please refer to https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G13.2261661 for additional information.

For an example of how to update existing code to use the new API, please see the changes in the `web/app.js` file included in this patch.

Please note: I'm not sure we'll ever be able to remove the old `PDFDocumentProxy.fingerprint` getter, given that it has existed since "forever". That probably isn't a big deal, however, since it's now limited to `GENERIC` builds only.


[1] Although this obviously depends on the PDF software following the specification, by updating the second file ID as intended.
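
As a rough illustration of the check described above, here is a minimal sketch. It assumes the new getter is `PDFDocumentProxy.fingerprints` and that it returns a two-element array `[original, modified]`, with the second entry being `null` when no distinct second file ID is available. The helper name `describeFingerprints` and the stand-in objects are hypothetical and not part of PDF.js.

```javascript
// Hypothetical helper, not part of PDF.js: summarizes the two file IDs.
// Assumes `pdfDocument.fingerprints` is `[original, modified]`, where
// `modified` is null when no distinct second ID is available.
function describeFingerprints(pdfDocument) {
  const [original, modified = null] = pdfDocument.fingerprints;
  return {
    original,
    modified,
    wasModified: modified !== null && modified !== original,
  };
}

// Stand-in objects for illustration only; a real `pdfDocument` would come
// from `await pdfjsLib.getDocument(url).promise`.
const untouched = { fingerprints: ["a1b2c3", null] };
const edited = { fingerprints: ["a1b2c3", "d4e5f6"] };

console.log(describeFingerprints(untouched).wasModified); // false
console.log(describeFingerprints(edited).wasModified); // true
```

As footnote [1] says, a `false` result only means that no differing second ID was found; software that doesn't update the second file ID will not be detected this way.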

@Snuffleupagus
Collaborator Author

/botio unittest

@pdfjsbot

pdfjsbot commented Jul 3, 2021

From: Bot.io (Linux m4)


Received

Command cmd_unittest from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.67.70.0:8877/bcfb08fa6b25559/output.txt

@pdfjsbot

pdfjsbot commented Jul 3, 2021

From: Bot.io (Windows)


Received

Command cmd_unittest from @Snuffleupagus received. Current queue size: 0

Live output at: http://3.101.106.178:8877/f8a17c2792616f1/output.txt

@pdfjsbot

pdfjsbot commented Jul 3, 2021

From: Bot.io (Linux m4)


Success

Full output at http://54.67.70.0:8877/bcfb08fa6b25559/output.txt

Total script time: 3.77 mins

  • Unit Tests: Passed

@pdfjsbot

pdfjsbot commented Jul 3, 2021

From: Bot.io (Windows)


Success

Full output at http://3.101.106.178:8877/f8a17c2792616f1/output.txt

Total script time: 5.51 mins

  • Unit Tests: Passed

@timvandermeij timvandermeij merged commit 5c7cd6f into mozilla:master Jul 3, 2021
@timvandermeij
Contributor

Nice find; thanks!

@Snuffleupagus Snuffleupagus deleted the fingerprints branch July 4, 2021 08:31
robertknight added a commit to hypothesis/client that referenced this pull request Sep 7, 2021
In mozilla/pdf.js#13661 the API for retrieving
PDF fingerprints (aka. File Identifiers) changed. In generic builds of
PDF.js the old API remains available, but not in the non-generic one
that Firefox's built-in viewer uses.

This commit makes Hypothesis use the new API if available or the old API
otherwise. The fingerprint value should be the same in both cases.

Fixes #3673
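
The "new API if available, old API otherwise" approach from that commit message could look roughly like the sketch below. This is not the actual Hypothesis code. It assumes the new getter is `fingerprints` (an array whose first entry is the original file ID) and the old one is `fingerprint` (a string); the helper name `getOriginalFingerprint` is hypothetical.

```javascript
// Hypothetical compatibility helper, not taken from Hypothesis or PDF.js.
// Prefers the newer array-valued `fingerprints` getter and falls back to
// the older string-valued `fingerprint` getter when the former is absent.
function getOriginalFingerprint(pdfDocument) {
  const ids = pdfDocument.fingerprints;
  if (Array.isArray(ids) && ids.length > 0) {
    return ids[0]; // first entry: original file ID
  }
  return pdfDocument.fingerprint; // older API
}

// Stand-in objects for illustration only.
const newerBuild = { fingerprints: ["a1b2c3", null] };
const olderBuild = { fingerprint: "a1b2c3" };

console.log(getOriginalFingerprint(newerBuild)); // "a1b2c3"
console.log(getOriginalFingerprint(olderBuild)); // "a1b2c3"
```

Checking the new getter first means the old one is only touched on builds that lack `fingerprints`.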
robertknight added a commit to hypothesis/client that referenced this pull request Sep 9, 2021
In mozilla/pdf.js#13661 the API for retrieving
PDF fingerprints (aka. File Identifiers) changed. In generic builds of
PDF.js the old API remains available, but not in the non-generic one
that Firefox's built-in viewer uses.

This commit makes Hypothesis use the new API if available or the old API
otherwise. The fingerprint value should be the same in both cases.

Fixes #3673