[Bug]: Cannot obtain text info with custom fonts #20007

Open
@fnlctrl

Attach (recommended) or Link to PDF file

CustomFont.pdf

Web browser and its version

Chrome 137.0.7151.103

Operating system and its version

macOS 15.5

PDF.js version

v5.3.42 (the version shown in the console log on https://mozilla.github.io/pdf.js/web/viewer.html)

Is the bug present in the latest PDF.js version?

Yes

Is a browser extension

No

Steps to reproduce the problem

  1. Open the PDF in the official PDF.js viewer
  2. Select all text, copy it, and paste it somewhere else

What is the expected behavior?

The original text can be copied and pasted correctly

What went wrong?

The pasted text is missing characters

Link to a viewer

No response

Additional context

Calling page.getTextContent() on the page also returns text in which the missing characters appear as '\x00'. The same happens with page.getOperatorList().
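
To make the '\x00' characters easier to spot, here is a minimal sketch of a checker. It assumes only the general shape of the object returned by page.getTextContent(): an `items` array whose text entries carry a `str` field. The helper name `findNulItems` and the sample data are hypothetical, made up for illustration; they are not part of PDF.js and are not taken from CustomFont.pdf.

```javascript
// Hypothetical helper: report which text items contain NUL ('\x00') characters.
// `textContent` is assumed to look like the result of page.getTextContent().
function findNulItems(textContent) {
  const hits = [];
  textContent.items.forEach((item, index) => {
    // Marked-content entries have no `str`, so skip anything that is not a string.
    if (typeof item.str !== "string") return;
    // Splitting on NUL yields one more piece than there are NUL characters.
    const nulCount = item.str.split("\x00").length - 1;
    if (nulCount > 0) {
      hits.push({
        index,
        nulCount,
        // Replace NUL with the visible "␀" symbol so it shows up when logged.
        visible: item.str.replaceAll("\x00", "\u2400"),
      });
    }
  });
  return hits;
}

// Hand-written sample standing in for real getTextContent() output.
const sample = {
  items: [
    { str: "He\x00lo" },
    { str: "world" },
    { type: "beginMarkedContent" },
    { str: "\x00\x00" },
  ],
};

console.log(findNulItems(sample));
```

Against a real document, the same helper could be called as `findNulItems(await page.getTextContent())` once `page` has been obtained from `pdf.getPage(1)`.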

With Chrome's built-in PDF viewer, the pasted text is correct, so the PDF shouldn't be corrupt.
With the macOS Preview app, the pasted text doesn't lose characters but has extra spaces in between, so there's probably something peculiar about this PDF.
