Fix parsing for minified css with extra semicolons #2

udhavsethi · 2019-05-06T11:12:28Z

This fixes the scenario where extra semicolons are immediately
followed (without whitespace) by a boundary token such as '}'.

For the ruleset: .foo{bar:baz;;}div .qux{vim:fet;}

Shady treats the first `}' as an extra token and adds div .qux rule
under the .foo selector.

This fixes the scenario where extra semicolons are immediately followed (without whitespace) by a boundary token such as '}'. For the ruleset: .foo{bar:baz;;}div .qux{vim:fet;} Shady treats the first `}' as an extra token and adds div .qux rule under the .foo selector.

hewholived · 2019-05-06T15:18:33Z

src/shady-css/parser.ts

@@ -131,7 +131,7 @@ class Parser {
    }

    while (tokenizer.currentToken &&
-           tokenizer.currentToken.is(TokenType.boundary)) {
+           tokenizer.currentToken.is(TokenType.semicolon)) {


Shouldn't you have || TokenType.boundary ?

semicolon is a boundary token. the problem is that close brace is also a boundary, so parseUnknown is too eager and swallows the closing brace while we're still in the open brace context, and subsequent statements get swallowed into the nested context.

I think this fix isn't ideal, but it's ok. it looks to me like it will bring Shady closer to actual browser css error recovery, but there are still a lot of error conditions that Shady will not handle correctly.

[};] are property boundaries. This is significantly limiting the list. Are you sure it's going to work for all cases? Please add tests and run all tests in nv-transform-css.

consuming just semicolon instead of other boundaries is harmless, since we're in error recovery. the unconsumed tokens will just trigger parseUnknown again. So instead of a single Skipped token with all the boundary tokens, we get a series of smaller Skipped tokens.

Yes, so when it's triggering for not semicolon, what is going to happen?

browser error recovery is basically "skip until boundary token", not "skip over boundary tokens". (it's not identical to "skip until boundary token", but "skip until" is closer to what browsers do than "skip over")

Why then you bundle ";;;;" into one discard? The whole while loop can go away then.

yes, I think the actual correct version is a while loop with a token != boundary. but I don't know if that will break expectations elsewhere.

I had the same idea, that a while loop with token!= boundary would be the apt solution (since that is what is suggested to be graceful recovery). I don't think that solution is logically correct. Consider, for example, the following CSS:

host { margin: 0;; padding: 0; }

With the token!= boundary loop:
The first semicolon is consumed as part of the margin rule,
The second semicolon triggers parseUnknown,
which consumes the semicolon and all the tokens up to the next boundary token, which is ":". So the 'padding' token gets consumed too.

I also had the idea of removing the while loop altogether, since it is not doing much. But this breaks a lot of tests since they seem to be accounting for not only the valid rules but the discarded ones too.
For example, the test called disregards extra semi-colons checks for nodeFactory.discarded(';;'), which would fail if the while loop is discarded and the unknown tokens were parsed one by one.
There are a number of other tests that fail on removing the loop and my intention was to fix the issue with low touch.

maribezler · 2019-05-06T18:05:53Z

src/test/fixtures.ts

@@ -68,6 +68,8 @@ div {

 export const minifiedRuleset = '.foo{bar:baz}div .qux{vim:fet;}';

+export const minifiedRulesetWithExtraSemicolons = '.foo{bar:baz;;}div .qux{vim:fet;}';


Why extraSemicolons test is not enough L52? Is there anything special about minified semicolons? New line is not a property boundary or anything. All is missing - spaces.

parseUnknown current consumes a sequence of adjacent boundary tokens.

When shady sees ";;\n}",
the first semicolon is consumed as part of the rule,
the second semicolon is unrecognized and triggers parseUnknown,
which consumes the semicolon but stops at the newline since newline is not a boundary token.

When shady sees ";;}\n",
parseUnknown will consume the closing brace too, which is the problem

maribezler · 2019-05-06T18:13:29Z

src/shady-css/parser.ts

@@ -131,7 +131,7 @@ class Parser {
    }

    while (tokenizer.currentToken &&
-           tokenizer.currentToken.is(TokenType.boundary)) {
+           tokenizer.currentToken.is(TokenType.semicolon)) {


[};] are property boundaries. This is significantly limiting the list. Are you sure it's going to work for all cases? Please add tests and run all tests in nv-transform-css.

hewholived reviewed May 6, 2019

View reviewed changes

flinst approved these changes May 6, 2019

View reviewed changes

maribezler requested changes May 6, 2019

View reviewed changes

maribezler approved these changes May 7, 2019

View reviewed changes

maribezler merged commit bf4ccfd into maribezler:master May 7, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix parsing for minified css with extra semicolons #2

Fix parsing for minified css with extra semicolons #2

Uh oh!

udhavsethi commented May 6, 2019

Uh oh!

hewholived May 6, 2019

Uh oh!

flinst May 6, 2019

Uh oh!

maribezler May 6, 2019

Uh oh!

flinst May 6, 2019

Uh oh!

maribezler May 6, 2019

Uh oh!

flinst May 6, 2019

Uh oh!

maribezler May 6, 2019

Uh oh!

flinst May 6, 2019

Uh oh!

udhavsethi May 7, 2019 •

edited

Loading

Uh oh!

udhavsethi May 7, 2019 •

edited

Loading

Uh oh!

maribezler May 6, 2019

Uh oh!

flinst May 6, 2019

Uh oh!

maribezler May 6, 2019

Uh oh!

Uh oh!

		@@ -68,6 +68,8 @@ div {

		export const minifiedRuleset = '.foo{bar:baz}div .qux{vim:fet;}';

		export const minifiedRulesetWithExtraSemicolons = '.foo{bar:baz;;}div .qux{vim:fet;}';

Fix parsing for minified css with extra semicolons #2

Fix parsing for minified css with extra semicolons #2

Uh oh!

Conversation

udhavsethi commented May 6, 2019

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

udhavsethi May 7, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

udhavsethi May 7, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

udhavsethi May 7, 2019 •

edited

Loading

udhavsethi May 7, 2019 •

edited

Loading