[Markdown] Fix HTML comment parser. #2121
RegExp had catastrophic backtracking. It also didn't match the spec it linked to. (Which is still CM 30, not 31.2; they differ.)

Fixed a few other incorrect parsings.

- The content of `<?...?>`, `<!a...>` and `<![CDATA[...]]>` can contain newlines. Changed `.` to `[^]`.
- The `<![CDATA[` tag is case sensitive. Changed the RegExp to be case sensitive, so added `A-Z` to all the `a-z`s used in that RegExp.
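To illustrate the first bullet, here is a minimal sketch of the difference between `.` and `[^]` in a Dart `RegExp`. The two patterns and their names (`dotPattern`, `anyCharPattern`) are simplified stand-ins invented for this example; they are not the patterns in `patterns.dart`.

```dart
// Illustrative patterns only; NOT the ones used in package:markdown.
// `.` does not match line terminators unless dotAll is enabled.
final dotPattern = RegExp(r'<\?.*?\?>');
// `[^]` is an empty negated class, so it matches any character,
// including newlines.
final anyCharPattern = RegExp(r'<\?[^]*?\?>');

void main() {
  const input = '<?php\n echo 1;\n?>';
  print(dotPattern.hasMatch(input)); // false: `.` stops at the newline
  print(anyCharPattern.hasMatch(input)); // true: `[^]` crosses newlines
}
```

Passing `dotAll: true` to the `RegExp` constructor would also let `.` match newlines, but it applies to every `.` in the pattern, whereas `[^]` can be used selectively.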
Package publishing
Documentation at https://github.com/dart-lang/ecosystem/wiki/Publishing-automation.
PR Health
Breaking changes
Package | Change | Current Version | New Version | Needed Version | Looking good? |
---|---|---|---|---|---|
markdown | Breaking | 7.3.0 | 7.3.1-wip | 8.0.0 | Got "7.3.1-wip" expected >= "8.0.0" (breaking changes) |
This check can be disabled by tagging the PR with `skip-breaking-check`.
Changelog Entry ✔️
Package | Changed Files |
---|
Changes to files need to be accounted for in their respective changelogs.
Coverage ✔️
File | Coverage |
---|---|
pkgs/markdown/lib/src/inline_syntaxes/inline_html_syntax.dart | 💚 100 % |
pkgs/markdown/lib/src/patterns.dart | 💚 100 % |
This check for test coverage is informational (issues shown here will not fail the PR).
API leaks ✔️
The following packages contain symbols visible in the public API, but not exported by the library. Export these symbols or remove them from your publicly visible API.
Package | Leaked API symbols |
---|
License Headers ✔️
// Copyright (c) 2025, the Dart project authors. Please see the AUTHORS file
// for details. All rights reserved. Use of this source code is governed by a
// BSD-style license that can be found in the LICENSE file.
Files |
---|
no missing headers |
All source files should start with a license header.
Unrelated files missing license headers
Files |
---|
pkgs/bazel_worker/benchmark/benchmark.dart |
pkgs/bazel_worker/example/client.dart |
pkgs/bazel_worker/example/worker.dart |
pkgs/benchmark_harness/integration_test/perf_benchmark_test.dart |
pkgs/boolean_selector/example/example.dart |
pkgs/clock/lib/clock.dart |
pkgs/clock/lib/src/clock.dart |
pkgs/clock/lib/src/default.dart |
pkgs/clock/lib/src/stopwatch.dart |
pkgs/clock/lib/src/utils.dart |
pkgs/clock/test/clock_test.dart |
pkgs/clock/test/default_test.dart |
pkgs/clock/test/stopwatch_test.dart |
pkgs/clock/test/utils.dart |
pkgs/coverage/lib/src/coverage_options.dart |
pkgs/html/example/main.dart |
pkgs/html/lib/dom.dart |
pkgs/html/lib/dom_parsing.dart |
pkgs/html/lib/html_escape.dart |
pkgs/html/lib/parser.dart |
pkgs/html/lib/src/constants.dart |
pkgs/html/lib/src/encoding_parser.dart |
pkgs/html/lib/src/html_input_stream.dart |
pkgs/html/lib/src/list_proxy.dart |
pkgs/html/lib/src/query_selector.dart |
pkgs/html/lib/src/token.dart |
pkgs/html/lib/src/tokenizer.dart |
pkgs/html/lib/src/treebuilder.dart |
pkgs/html/lib/src/utils.dart |
pkgs/html/test/dom_test.dart |
pkgs/html/test/parser_feature_test.dart |
pkgs/html/test/parser_test.dart |
pkgs/html/test/query_selector_test.dart |
pkgs/html/test/selectors/level1_baseline_test.dart |
pkgs/html/test/selectors/level1_lib.dart |
pkgs/html/test/selectors/selectors.dart |
pkgs/html/test/support.dart |
pkgs/html/test/tokenizer_test.dart |
pkgs/html/test/trie_test.dart |
pkgs/html/tool/generate_trie.dart |
pkgs/pubspec_parse/test/git_uri_test.dart |
pkgs/stack_trace/example/example.dart |
pkgs/watcher/test/custom_watcher_factory_test.dart |
pkgs/yaml_edit/example/example.dart |
Pull Request Test Coverage Report for Build 16026605421
💛 - Coveralls
  InlineHtmlSyntax()
-     : super(_pattern, startCharacter: $lt, caseSensitive: false);
+     : super(_pattern, startCharacter: $lt, caseSensitive: true);
The `<![CDATA[` tag is case sensitive, so the RegExp should be too.
Added `A-Z` to all the `a-z`s used here and in `namedTagDefinition`.
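A rough sketch of why switching to a case-sensitive RegExp requires widening the character classes. All four patterns below are hypothetical simplifications made up for this example; the real `_pattern` and `namedTagDefinition` are more involved.

```dart
// Hypothetical simplified patterns, for illustration only.
// Dart's RegExp is case sensitive by default.
final cdata = RegExp(r'<!\[CDATA\[[^]*?\]\]>');
final cdataInsensitive =
    RegExp(r'<!\[CDATA\[[^]*?\]\]>', caseSensitive: false);
// Once matching is case sensitive, `a-z` alone no longer covers
// upper-case tag names, so `A-Z` has to be spelled out.
final tagLowerOnly = RegExp(r'^<[a-z][a-z0-9-]*>$');
final tagBothCases = RegExp(r'^<[A-Za-z][A-Za-z0-9-]*>$');

void main() {
  print(cdata.hasMatch('<![CDATA[x]]>')); // true
  print(cdata.hasMatch('<![cdata[x]]>')); // false
  print(cdataInsensitive.hasMatch('<![cdata[x]]>')); // true
  print(tagLowerOnly.hasMatch('<DIV>')); // false
  print(tagBothCases.hasMatch('<DIV>')); // true
}
```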
final html = markdownToHtml(input); // Should not hang.
expect(html, isNotNull); // To use the output.
final elapsed = time.elapsedMilliseconds;
expect(elapsed, lessThan(10000));
What's the expected runtime now? Is there a big enough margin for low-powered machines running this test?
Idea: Rather than using a fixed timeout, could we measure 1, 2, 3, and 4 paragraphs and verify that the runtime grows roughly linearly instead of exponentially?
It should be linear in the size of the input, like all the other RegExps. Which means it'll drown in the noise of everything else that is being done.
I inserted a `print(elapsed);` in the test, with four `-` lines, and ran it five times. It took times in the range 115-130 ms.
Then I added 16 more entries, quintupling the size, and ran the test again. That took 133-162 ms.
For the heck of it, I quintupled it again, adding 80 more entries, and it now took 142-186 ms.
The time taken by running that RegExp is trivial. I'd guess compiling it at runtime takes longer than running it on any reasonable example.
But why guess when I can check it!
I put a `for (var i = 0; i < 2; i++) {...}` around the code of the test, to run it again after the RegExp has already been compiled.
The first run of each test takes about the same time as before, 137-183 ms.
The second run takes 4-14 ms, for the 100-entry sample text.
Running the RegExp is not taking any significant time. Checking that it is linear would take a much larger input, 1000+ lines, before the time is even measurably above noise level (4-14 ms is a 3.5x variance).
I'm not worried. This test just checks that we don't revert the RegExp accidentally.
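The measurement described above can be sketched roughly as follows. This is only an approximation of the experiment: it times a stand-in comment pattern directly instead of calling `markdownToHtml`, and the names `pattern`, `countMatches` and `timeRun` are invented for the example. Actual timings depend on the machine, so none are shown.

```dart
// Hypothetical stand-in for the pattern under test; NOT the real one.
final pattern = RegExp(r'<!--[^]*?-->');

/// Builds an input with [entries] multi-line comments and counts matches.
int countMatches(int entries) {
  final input = List.filled(entries, '<!-- a\nb -->').join('\n');
  return pattern.allMatches(input).length;
}

/// Times one matching pass over an input with [entries] comments.
Duration timeRun(int entries) {
  final stopwatch = Stopwatch()..start();
  countMatches(entries);
  stopwatch.stop();
  return stopwatch.elapsed;
}

void main() {
  // Warm-up run, so RegExp compilation is not counted in later timings.
  timeRun(4);
  // Same sizes as in the comment above: 4, 20 and 100 entries.
  for (final n in [4, 20, 100]) {
    print('$n entries: ${timeRun(n).inMicroseconds} us');
  }
}
```

Separating the warm-up run from the measured runs mirrors the two-iteration loop in the comment: the first pass pays for compiling the RegExp, the later ones show only the matching cost.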
Revisions updated by `dart tools/rev_sdk_deps.dart`.

ai (https://github.com/dart-lang/ai/compare/64dfa7f..9b007b3):
  9b007b3 2025-07-07 Jacob MacDonald Add failure reasons to tool call analytics events (dart-lang/ai#219)
  c8dc5da 2025-07-07 Jacob MacDonald don't bail early when running in multiple roots (dart-lang/ai#218)
  2541b6c 2025-07-02 Kenzie Davisson Remove VS Code mcp instructions in favor of Dart-Code setting. (dart-lang/ai#206)
  70daa1f 2025-07-02 Jacob MacDonald release dart_mcp 0.3.0 (dart-lang/ai#216)
  a252a46 2025-07-01 Jacob MacDonald add retry logic to try and make dtd_test less flaky (dart-lang/ai#214)
  9e0b973 2025-07-01 Jacob MacDonald add a test that the arg parser library only depends on package:args (dart-lang/ai#213)

http (https://github.com/dart-lang/http/compare/e70a41b..7d2d87e):
  7d2d87e 2025-07-02 Brian Quinlan Fix `Connection reset by peer` in protocol error tests (dart-lang/http#1786)

i18n (https://github.com/dart-lang/i18n/compare/ab90327..42c4932):
  42c49328 2025-07-07 Googler No public description
  87fd0156 2025-07-07 Michael Goderbauer [intl4x] Re-enable Windows (dart-lang/i18n#986)
  912a7720 2025-07-07 Copybara-Service Merge pull request `#985` from dart-lang:fixConstantEvaluator
  52f5beeb 2025-07-07 Moritz Small cleanups in intl4x (dart-lang/i18n#988)
  6e8ef245 2025-07-07 Moritz squash

sync_http (https://github.com/dart-lang/sync_http/compare/dc54465..c07f96f):
  c07f96f 2025-07-03 Kevin Moore Update to latest lints, required Dart 3.7 (google/sync_http.dart#55)

tools (https://github.com/dart-lang/tools/compare/7bf22c9..6282b35):
  6282b35e 2025-07-03 Lasse R.H. Nielsen [Markdown] Fix HTML comment parser. (dart-lang/tools#2121)

web (https://github.com/dart-lang/web/compare/3e11172..fb8a149):
  fb8a149 2025-07-07 Nikechukwu Add Support for Configuration of Dart JS Interop Gen (dart-lang/web#386)

Change-Id: Ib243021ed77846a8451f60fa320e5cf40e85aa27
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/439320
Commit-Queue: Konstantin Shcheglov <[email protected]>
Auto-Submit: Devon Carew <[email protected]>
Reviewed-by: Konstantin Shcheglov <[email protected]>
See #2119
Fix performance and correctness of HTML comment parser.

RegExp had catastrophic backtracking. It also didn't match the spec it linked to.
(Which is still CM 30, not 31.2; they differ.)

Fixed a few other incorrect parsings.

- The content of `<?...?>`, `<!a...>` and `<![CDATA[...]]>` can contain newlines. Changed `.` to `[^]`.
- The `<![CDATA[` tag is case sensitive. Changed the RegExp to be case sensitive, so added `A-Z` to all the `a-z`s used in that RegExp.