PDF.js does not recognize semantic markup in PDFs #214
Comments
PDF.js appears to be making recent improvements to reading the tag tree:
Corey at OSU emailed us to report that the latest PDF.js pre-release, v2.9.359, may work quite well.
Thanks for the update Matt. Per the release notes (https://github.com/mozilla/pdf.js/releases/tag/v2.9.359), there are some significant changes to rendering of the hidden text layer in this release:
This could affect the anchoring of existing annotations made with Hypothesis, so we need to test this carefully before we can ship the change.
This has been an issue for so long. Completely awesome if this really is a substantial improvement. Obviously we need to understand the impact any changes would have. However, assuming that:
I think the decision should probably be to proceed anyway (assuming there isn't some magic solution, needing implementation, that would allow us both to proceed and to successfully reanchor historic annotations). We're still in a kind of happy early state where the large majority of annotations are freshly made on documents each semester, and neither students nor teachers are able to return to the ones they made in a prior course. That will soon change with course copy functionality (at some point) allowing teachers to copy forward annotations made as scaffolding on documents they teach regularly, and also with any features that allow students to claim and preserve annotations they make during courses. Obviously, with more than 25 million annotations now, made over the course of 7 years or so, there may be some pain, but moving towards better tech for the billions of annotations that will follow probably gets the vote.
Internal Slack convos for reference:
Solved with the update to the latest PDF.js: hypothesis/pdf.js-hypothes.is@0fc20ea
As reported to us during a meeting with the Ohio State University accessibility team, Hypothesis, which uses PDF.js as its PDF viewer, does not recognize or make visible any semantic markup or tagging (the tag tree) that the PDF author may have applied. As a result, any such tagging is opaque to screen readers and other adaptive technology tools.
This is a large barrier to meeting the accessibility requirements at OSU, and a considerable gap in our efforts to meet the accessibility needs of all our users.