Skip to content

Better spacing in text layer. #6588

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Nov 3, 2015
Merged

Conversation

yurydelendik
Copy link
Contributor

Alternative solution for #6019 (and fixes its PDF; attaching cleaned up version by @Snuffleupagus )

Improved /test/pdfs/pdf.pdf page 3 layout:
_screen shot 2015-11-02 at 8 43 45 am

@yurydelendik
Copy link
Contributor Author

/botio test

@pdfjsbot
Copy link

pdfjsbot commented Nov 2, 2015

From: Bot.io (Linux)


Received

Command cmd_test from @yurydelendik received. Current queue size: 0

Live output at: http://107.21.233.14:8877/69081d6b39fd047/output.txt

@pdfjsbot
Copy link

pdfjsbot commented Nov 2, 2015

From: Bot.io (Windows)


Received

Command cmd_test from @yurydelendik received. Current queue size: 0

Live output at: http://107.22.172.223:8877/fe827d75b5732d7/output.txt

@pdfjsbot
Copy link

pdfjsbot commented Nov 2, 2015

From: Bot.io (Windows)


Failed

Full output at http://107.22.172.223:8877/fe827d75b5732d7/output.txt

Total script time: 19.17 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: FAILED

Image differences available at: http://107.22.172.223:8877/fe827d75b5732d7/reftest-analyzer.html#web=eq.log

@pdfjsbot
Copy link

pdfjsbot commented Nov 2, 2015

From: Bot.io (Linux)


Failed

Full output at http://107.21.233.14:8877/69081d6b39fd047/output.txt

Total script time: 20.11 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: FAILED

Image differences available at: http://107.21.233.14:8877/69081d6b39fd047/reftest-analyzer.html#web=eq.log

@Snuffleupagus
Copy link
Collaborator

Generally this is looking like a really nice improvement, but I happened to notice that in ìssue3205r.pdf this patch causes a slight regression in the textLayer:
https://www.dropbox.com/s/b7va9xuuodkrar3/issue3205r_spacing.png?dl=0

It might be an idea to add issue3205r.pdf as a text test as well!?

@yurydelendik
Copy link
Contributor Author

I happened to notice that in ìssue3205r.pdf this patch causes a slight regression in the textLayer

This is a really bad case for us -- it overprints text at the same spot: first is prints "log log", then variable names, and after that operators. That makes it a really bad (I would say corrupted) PDF since it's not preserving a reading order. We can safely discard this a regression (it did not work properly in first place), and we probably have few issues already opened for this particular case.

"file": "pdfs/issue6019.pdf",
"md5": "7a2e5dda3b0fc5c2e9060e378a8cdc4e",
"rounds": 1,
"type": "eq"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be a text test instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is a text test. Thank you.

@Snuffleupagus
Copy link
Collaborator

We can safely discard this a regression (it did not work properly in first place), and we probably have few issues already opened for this particular case.

That seems fine by me, thanks for the explanation!

However, I'm also wondering about the test changes in issue5972.pdf, since they seem to regress issue #5972. Is that another case of a weird PDF file, or does it need to be addressed before we land this?

@yurydelendik
Copy link
Contributor Author

Hmmm, that tough one, looking at the screenshot below I might say that #5995 was a regression ("after" means after #6588 patch):

_screen shot 2015-11-03 at 7 36 00 am

Looking at start of the letters, #5995 started moving all letters to the right due to artificial padding at the front of the text run. Per spec: "When the glyph for each character in the string is rendered, Tc shall be added ..."

IMHO to fix #5972, is last char and word spacing is remembered to be used as a right padding of the div block in the text layer. What do you prefer: address it here, re-open #5972 or perhaps in #6590?

@@ -1807,7 +1814,7 @@
"link": false,
"type": "load",
"password": "\u00E6\u00F8\u00E5",
"about": "The password (���) is UTF8 encoded."
"about": "The password (���) is UTF8 encoded."
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: there appears to be some kind of encoding issue here.

@Snuffleupagus
Copy link
Collaborator

IMHO to fix #5972, is last char and word spacing is remembered to be used as a right padding of the div block in the text layer. What do you prefer: address it here, re-open #5972 or perhaps in #6590?

I've got no real preference; do whatever you think is the easiest :-)

If you choose to handle it in #6590, feel free to land the current PR with r=me (after fixing the encoding issue in test_manifest.json), since other than that it looks really good.

@yurydelendik
Copy link
Contributor Author

/botio makeref

@pdfjsbot
Copy link

pdfjsbot commented Nov 3, 2015

From: Bot.io (Windows)


Received

Command cmd_makeref from @yurydelendik received. Current queue size: 0

Live output at: http://107.22.172.223:8877/74d459636fca8b9/output.txt

@pdfjsbot
Copy link

pdfjsbot commented Nov 3, 2015

From: Bot.io (Linux)


Received

Command cmd_makeref from @yurydelendik received. Current queue size: 0

Live output at: http://107.21.233.14:8877/15296db5c959b5a/output.txt

yurydelendik added a commit that referenced this pull request Nov 3, 2015
Better spacing in text layer.
@yurydelendik yurydelendik merged commit 7e8dacf into mozilla:master Nov 3, 2015
@yurydelendik
Copy link
Contributor Author

@Snuffleupagus thanks for review

@pdfjsbot
Copy link

pdfjsbot commented Nov 3, 2015

From: Bot.io (Windows)


Success

Full output at http://107.22.172.223:8877/74d459636fca8b9/output.txt

Total script time: 19.40 mins

  • Lint: Passed
  • Make references: Passed
  • Check references: Passed

@pdfjsbot
Copy link

pdfjsbot commented Nov 3, 2015

From: Bot.io (Linux)


Success

Full output at http://107.21.233.14:8877/15296db5c959b5a/output.txt

Total script time: 20.28 mins

  • Lint: Passed
  • Make references: Passed
  • Check references: Passed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants