security: replace vulnerable regex with parser #1223

Merged
merged 5 commits into from
Apr 17, 2018
Changes from 2 commits

45 changes: 42 additions & 3 deletions lib/marked.js
@@ -554,9 +554,48 @@ inline.normal = merge({}, inline);
inline.pedantic = merge({}, inline.normal, {
  strong: /^__(?=\S)([\s\S]*?\S)__(?!_)|^\*\*(?=\S)([\s\S]*?\S)\*\*(?!\*)/,
  em: /^_(?=\S)([\s\S]*?\S)_(?!_)|^\*(?=\S)([\s\S]*?\S)\*(?!\*)/,
  link: edit(/^!?\[(label)\]\(\s*<?([\s\S]*?)>?(?:\s+(['"][\s\S]*?['"]))?\s*\)/)
    .replace('label', inline._label)
    .getRegex(),
  link: {
    exec: function (s) {
Member

@joshbruce Apr 16, 2018

Might be worth adding some doc blocks to introduce the why behind some of this...nothing too major, just to help those new to the code.

      // [TEXT](DESTINATION)
      var generalLinkRe = edit(/^!?\[(label)\]\((.*?)\)/)
        .replace('label', inline._label)
        .getRegex();

      function unwrapCarats (str) {
Member

Carets?? Carat and carrot are different. :)

Angle brackets might be most appropriate if I'm reading the regex correctly.

Contributor Author

Haha I was wondering if I was spelling that right. I'll switch to AngleBrackets anyway.
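
As a small illustration of the rename discussed in this thread, here is a minimal sketch of the helper under the name `unwrapAngleBrackets`. It is not the code that was merged. It uses `slice(1, -1)` rather than `substr`, because `substr(start, length)` takes a length, not an end index, so a length of `str.length - 1` would keep the trailing `>`.

```javascript
// Sketch only: the name follows the rename proposed in the review thread.
// Removes one pair of surrounding angle brackets: "<url>" -> "url".
function unwrapAngleBrackets(str) {
  // Only unwrap when the whole string is wrapped in "<" and ">".
  if (/^<.*>$/.test(str)) {
    // slice(1, -1) drops the first and the last character.
    return str.slice(1, -1);
  }
  return str;
}

unwrapAngleBrackets('<http://example.com>'); // 'http://example.com'
unwrapAngleBrackets('http://example.com');   // unchanged

// For comparison, substr counts characters from the start index:
'<a>'.substr(1, '<a>'.length - 1); // 'a>' (trailing bracket kept)
```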

        if (str.match(/^<.*>$/)) {
          // substr takes (start, length): drop both the leading '<' and the trailing '>'
          str = str.substr(1, str.length - 2);
        }
        return str;
      }

      var fullMatch = generalLinkRe.exec(s);
      if (fullMatch) {
Member

Can you flip this so that the if statement is smaller:

if (!fullMatch) {
  return null;
}
// ... split and such here
return [fullMatch[0], text, destinationAndTitle[0], destinationAndTitle[1]];

        var text = fullMatch[1];
        var destination = fullMatch[2];

        var m;

        var destinationAndTitleRe = /^([^'"(]*[^\s])\s+(['"(].*['")])/;
        if (m = destinationAndTitleRe.exec(destination)) {
          // <destination> -> destination
          var dest1 = m[1].trim();
          dest1 = unwrapCarats(dest1);
          var title1 = m[2];
          return [fullMatch[0], text, dest1, title1];
        }

        var destinationRe = /^(<?[\s\S]*>?)/;
        if (m = destinationRe.exec(destination)) {
Member

@styfle Apr 16, 2018

The two if blocks are nearly identical. Can you make a common function for those? Something like this:

function getMatch(r, fullMatch) {
  var m = r.exec(fullMatch[2]);
  if (m) {
    var dest = unwrapAngleBrackets(m[1].trim());
    var title = m[2];
    return [fullMatch[0], fullMatch[1], dest, title];
  }
}

Member

Agree, but not a deal breaker for my review.

Contributor Author

Will do.
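
To make the two suggestions in this review concrete (the early return and the shared helper), here is a self-contained sketch of how the parser might look with both applied. It is an illustration under assumptions, not the merged implementation: `parseLink` is a hypothetical name, and `generalLinkRe` uses a simplified label pattern in place of the one built with `edit(...)` and `inline._label`. The two destination patterns are copied from the diff.

```javascript
// Sketch only: parseLink and the simplified label pattern are assumptions.

// Removes one pair of surrounding angle brackets: "<url>" -> "url".
function unwrapAngleBrackets(str) {
  return /^<.*>$/.test(str) ? str.slice(1, -1) : str;
}

// Shared helper: apply one pattern to the raw destination text.
function getMatch(re, fullMatch) {
  var m = re.exec(fullMatch[2]);
  if (!m) {
    return null;
  }
  var dest = unwrapAngleBrackets(m[1].trim());
  // The destination-only pattern has no second group, so default to ''.
  var title = m[2] || '';
  return [fullMatch[0], fullMatch[1], dest, title];
}

// Simplified stand-in for the label-aware regex from the diff.
var generalLinkRe = /^!?\[([^\]]*)\]\((.*?)\)/;
// DESTINATION followed by a quoted or parenthesized TITLE.
var destinationAndTitleRe = /^([^'"(]*[^\s])\s+(['"(].*['")])/;
// DESTINATION only.
var destinationRe = /^(<?[\s\S]*>?)/;

function parseLink(src) {
  var fullMatch = generalLinkRe.exec(src);
  // Early return keeps the main path unindented.
  if (!fullMatch) {
    return null;
  }
  return getMatch(destinationAndTitleRe, fullMatch) ||
    getMatch(destinationRe, fullMatch);
}

parseLink('[text](<http://example.com> "Title")');
// ['[text](<http://example.com> "Title")', 'text', 'http://example.com', '"Title"']
parseLink('[a](b)'); // ['[a](b)', 'a', 'b', '']
parseLink('no link here'); // null
```

Note that the title keeps its surrounding quotes, as in the diff, where `m[2]` is returned unchanged.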

          // <destination> -> destination
          var dest2 = m[1].trim();
          destination = unwrapCarats(dest2);
Member

destination is assigned but never used. Is this intentional?

Contributor Author

Nope. Will fix.

          var title2 = '';
          return [fullMatch[0], text, dest2, title2];
        }
      }
      return null;
    }
  },
  reflink: edit(/^!?\[(label)\]\s*\[([^\]]*)\]/)
    .replace('label', inline._label)
    .getRegex()